The Dirac delta function is defined as $\delta(x)=\infty$ when $x=0$ and $\delta(x)=0$ otherwise. I am wondering how we can derive $\int_{-\infty}^{\infty}\delta(x)\ dx=1$, or is this just a definition?
-
The Dirac distribution isn't really a function (it's a bad misnomer), so you can treat it as a definition. – Adam Hughes Jan 07 '15 at 21:23
-
@AdamHughes We use it all the time, but I now find it troubling: how can we say that a function defined like that has integral 1 rather than 2? – 89085731 Jan 07 '15 at 21:27
-
Exactly why it's best not to think of it like a function. It's really an atomic probability measure on $\Bbb R$ with mass $1$ at $0$ and none elsewhere. There is another atomic measure with mass $2$ which is twice the Dirac measure, yes, but we are singling out this one. – Adam Hughes Jan 07 '15 at 21:33
5 Answers
That isn't how the Dirac delta is defined; that is how it is conceptualized.
The Dirac delta distribution is defined as the distribution with the property:
$$\int_{-\infty}^{\infty} \delta(x - k)f(x) {\rm d}x = f(k)$$
So if you take the case $k = 0$ and $f(x) = 1$ then you get your result.
The conceptualization that $\delta(x) = \begin{cases} \infty \text{ for } x=0\\ 0 \text{ for } x \ne 0 \end{cases}$ is very useful in disciplines like electrical engineering and DSP, where it is sometimes called a "spike" and approximations of it can be seen on an oscilloscope. But because of its ambiguity, "$\infty$" is almost never used as an object in its own right, for just the reason you state: it doesn't have properties that would let us evaluate the integral.
-
The problem with the integral sign is that people think that you are talking about an integral, and you are not. At least until people are sufficiently familiar with what distributions are (and the OP is not), I think that writing $\langle \delta,f\rangle$ or other similar notation is best. (Even if you want to look at $\delta$ as a measure, it is best to write the integral differently.) – Mariano Suárez-Álvarez Jan 07 '15 at 21:45
-
@MarianoSuárez-Alvarez It seems to me (admittedly only taking a cursory look) that treating it as an integral would be perfectly consistent, since you are defining a new object type (distribution) and how it is to be treated as an integral. For example, you don't have to redefine the concept of integral when $\mathbb R$ is generalized to $\mathbb C$, or to other types of objects for which $+$ and scalar multiplication are defined. – DanielV Jan 07 '15 at 21:57
-
@DanielV It's very confusing to see what you wrote, because the Dirac delta isn't a function and you haven't even defined integrals for things that aren't functions. If one sees an integral, one would assume it is the thing that they are already familiar with, especially if not otherwise mentioned. – JLA Jan 07 '15 at 22:02
-
It is consistent, but coalescing the notations from the start can lead to problems. Distributions are simply not functions; there are many things that one can do to integrands that simply do not make any sense with $\delta$, and when starting out, using the same notation for two things requires extra vigilance. Notation can be useful as a constant reminder of the difference. – Mariano Suárez-Álvarez Jan 07 '15 at 22:05
-
One nice thing about the integration notation is that it does give us certain nice, consistent extensions. For example, the definition $\delta' = f \mapsto -f'(0)$ can be informally justified by integration by parts. – Ian Jan 07 '15 at 23:38
-
The Dirac delta is a distribution of order $0$, and so can be viewed as a measure. A good notation in order not to make mistakes is $$\int f(x) \,\delta_k(\mathrm d x) = f(k)$$ – LL 3.14 Jul 07 '20 at 17:56
The idea that "$\delta(0)=\infty$ and $\delta(x)=0$ otherwise" is not a definition of the Dirac delta function; it is just a convenient graphical intuition. There is, however, a sense in which one can approximate $\delta$ by ever-sharper, ever-taller needle curves jutting out at $x=0$. I'll explain momentarily.
Often in life we have weighted averages. For instance, my grade in my history class consisted of three parts: attendance was 25 points, midterm was 100 points, and final exam 125 points, for a total of 250 points. Thus, if my scores on attendance, midterm and final are $A,B,C$ then my final grade can be computed as the weighted average $\frac{25}{250}A+\frac{100}{250}B+\frac{125}{250}C$. The coefficients of the unknowns are the weights: they specify to what extent each score contributes to my final grade.
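The grade formula above can be sketched in a few lines of Python. The point values are the ones from the paragraph; the three scores are hypothetical percentages chosen only for illustration.

```python
# Point values from the text: attendance 25, midterm 100, final 125 (total 250).
weights = {"attendance": 25, "midterm": 100, "final": 125}

# Hypothetical scores A, B, C in percent (illustrative values, not from the text).
scores = {"attendance": 100.0, "midterm": 80.0, "final": 90.0}

def weighted_average(scores, weights):
    """Sum of (weight / total weight) * score over all components."""
    total = sum(weights.values())
    return sum(weights[name] / total * scores[name] for name in weights)

grade = weighted_average(scores, weights)
print(grade)
```

Because the final exam carries half of the total weight, changing its score moves the grade more than changing either of the other two.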
Another example is mass and density. The volume of a region in space is just the volume integral over the region, $\int_RdV$, in which every point in space contributes equally to the measure. But mass is different, and is $\int_R\rho\, dV$, where $\rho$ is the density at a particular point in space. The center of mass of an object of uniform density would be $\frac{1}{V}\int_R\vec{x}\,dV$ with $V=\int_R dV$, i.e. the vectorial "average" of the points within the region. But if the object does not have uniform density, then the center is $\frac{1}{M}\int_R\rho\vec{x}\,dV$ with $M=\int_R\rho\,dV$, so that how much a point contributes to the vectorial average is proportional to the density at that point in space. If an object is lopsided, with higher density in one part of the space it occupies, then the center of mass will be more biased towards that direction than it otherwise would be.
Other examples exist too. If $X$ is a random variable, how do we compute the expected value of the new random variable $f(X)$ using $X$'s probability distribution? Same idea: $\int f(x)p_X(x)dx$, an average of $f$'s values weighted by the probability of the corresponding inputs. Many times in physics and PDEs, integral transforms weigh functions against kernels to get new functions, generalizing matrix multiplication (if we think of a function as a coordinate vector, where each possible input is an index and the output is the coordinate at that index). For instance, heat kernels and Fourier methods use integral transforms. (Delta is also used this way; see Green functions.)
The takeaway is this: sometimes the use of a "weighing" function is in integrating other functions against it. The act of weighing functions $f$ against a given weight function $w$, i.e. the assignment given by $f\mapsto \int wf$, is a linear map from the space of (suitable) functions to scalars, which inspires the idea: what if we speak more generally of such linear functionals, and not necessarily ones that can be obtained by integrating $f$ against a weight function? These are distributions, also known as "generalized functions."
The Dirac delta is the distribution $f\mapsto f(0)$, which is obviously linear. For convenience, even though this distribution cannot be obtained from a bona fide function, we "pretend" (notationally, at least) that it can and write $\int f(x)\delta(x)dx=f(0)$. There is of course no function with this property, and the notation is tricky because, as Mariano says, there are many things one can do to integrands that one cannot do to this make-believe integrand $\delta(x)$, and in any case it inevitably confuses newcomers. And yet there are many manipulations of $\delta$ that work even though it's not an integrand, or even give us more power. For instance, by invoking integration by parts, we can speak of so-called weak solutions to PDEs by transforming PDEs into (logically weaker) integral equations and then reinterpreting "integrating a function against" as "applying a distribution to."
There are certainly families of "spike" functions $\delta_\epsilon(x)$ which grow an ever thinner and taller spike at $x=0$ (as $\epsilon\to0^+$) so that, while $\lim_{\epsilon\to0^+}\delta_\epsilon(x)$ would converge pointwise to a function which is $0$ for all $x\ne0$ but not defined at $x=0$, nonetheless $f(0)=\lim_{\epsilon\to0^+}\int \delta_\epsilon(x)f(x)dx$ (note the limit is on the outside of the integral, not on the inside). This justifies the intuition that $\delta$ is $0$ outside of $x=0$ and an infinite spike at $x=0$. Wikipedia provides the following example:
[Figure: plots of the spike functions $\delta_\epsilon(x)$ for decreasing $\epsilon$]
These are the functions $\delta_\epsilon(x)=e^{-(x/\epsilon)^2}/(\epsilon\sqrt{\pi})$. In fact, each is a probability distribution, so $\delta(x)$ is a certain "weak limit" of probability distributions. Anyway, because of this infinite spike nature of $\delta$, it is used in physics to model impulses; this is essentially an idealization in which we let the region of space a pulse acts on tend to a zero-dimensional point, forcing us to up the amplitude to infinity along the way as compensation, so the resulting dynamics approach a meaningful limit.
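A rough numerical check of the two claims about these Gaussian spikes (unit area, and $\int\delta_\epsilon f\to f(0)$) might look like the following sketch. The test function $\cos$, the midpoint rule, and the cutoff at $\pm10\epsilon$ are all choices made here for illustration, not part of the answer.

```python
import math

def delta_eps(x, eps):
    """Gaussian spike exp(-(x/eps)^2) / (eps * sqrt(pi)) from the text."""
    return math.exp(-(x / eps) ** 2) / (eps * math.sqrt(math.pi))

def integrate(g, a, b, n=200_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sift(f, eps):
    """Approximate the integral of delta_eps(x) * f(x) over the real line.

    Assumption: the spike is negligible beyond 10*eps (it is below e^-100 there),
    so the integral is truncated to [-10*eps, 10*eps].
    """
    return integrate(lambda x: delta_eps(x, eps) * f(x), -10 * eps, 10 * eps)

for eps in (1.0, 0.1, 0.01):
    area = sift(lambda x: 1.0, eps)   # f = 1 gives the area under the spike
    value = sift(math.cos, eps)       # should approach cos(0) = 1 as eps shrinks
    print(f"eps={eps}: area={area:.6f}, integral against cos={value:.6f}")
```

For this particular test function the integral has the closed form $e^{-\epsilon^2/4}$, so the printed values should creep up towards $1$ while the area stays at $1$ for every $\epsilon$.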
This is a conventional notation, as this integral does not exist.
Anyway, if it did, it could be evaluated as the difference of its antiderivative at $\pm\infty$. This hypothetical antiderivative would be constant everywhere (zero derivative), except for a discontinuity at $0$ (infinite derivative).
Such a function is known as the Heaviside step $h$, defined as $0$ for negatives and $1$ for positives.
$$\int_{-\infty}^{\infty}\delta(x)\ dx=h(x)\Big|_{-\infty}^{\infty}=1.$$
A $\delta$ function for which this integral yielded $0$ or $\infty$ would be of little interest, as it would only produce degenerate equalities. For the sake of simplicity, the area is defined to be $1$ (in principle any finite nonzero value could be assigned, since rescaling does not change the pointwise description of $\delta$).
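The identification of the antiderivative with the Heaviside step can be sketched as a short formal computation. It assumes a differentiable test function $f$ that vanishes at $\pm\infty$ (an assumption added here for illustration) and uses integration by parts:
$$\int_{-\infty}^{\infty}h'(x)f(x)\ dx=h(x)f(x)\Big|_{-\infty}^{\infty}-\int_{-\infty}^{\infty}h(x)f'(x)\ dx=-\int_{0}^{\infty}f'(x)\ dx=f(0).$$
So $h'$ acts on $f$ exactly as $\delta$ does, which is the sense in which the step is an antiderivative of $\delta$.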
The $\delta$ function is often presented as the limit of a peak function becoming narrower and narrower, that acts as an averaging weight. A weighted average is obtained as $$\overline f_w=\frac{\int_{-\infty}^{\infty}w(x)f(x)\ dx}{\int_{-\infty}^{\infty}w(x)\ dx}.$$ The weights are said to be normalized when $$\int_{-\infty}^{\infty}w(x)\ dx=1,$$ and the average reduces to $$\overline f_w=\int_{-\infty}^{\infty}w(x)f(x)\ dx.$$ You can see $\delta$ as a perfectly concentrated weighting function (all weight at $0$), which is normalized, so that $$\overline f_\delta=\int_{-\infty}^{\infty}\delta(x)f(x)\ dx=f(0).$$
Let $X = C^0(\mathbb{R}) \cap L^2(\mathbb{R})$ be the space of continuous functions $f:\mathbb{R} \to \mathbb{R}$ with $\int_{-\infty}^\infty |f|^2< \infty$. (We can also work with the space of compactly supported $f$, etc.) Consider the linear map $L:X \to \mathbb{R}$ defined by $L(f) = f(0)$. By the Riesz representation theorem, any continuous linear map $H\to \mathbb{R}$ with $H$ a Hilbert space must be of the form $x \mapsto (x, y)$ for some fixed $y\in H$. Our space $X$ is not a Hilbert space; it's just a vector space with inner product $(f, g) = \int_{-\infty}^\infty fg$, under which it is not complete. If it were complete (and $L$ were continuous for this inner product), we could write $ \int_{-\infty}^\infty f\delta = L(f) = f(0)$ for some function $\delta\in X$.
That's what the $\delta$ function is: It's the functional $f \mapsto f(0)$, written as $$f \mapsto \int_{-\infty}^\infty f(x) \delta(x)\, dx = f(0)$$ by the analogy above. In particular, taking $f\equiv 1$ gives $\int_{-\infty}^\infty \delta = 1$. There certainly isn't a continuous function $\delta$ that satisfies the previous equation. On the other hand, we can for example define bump functions $$ h_n(x) = \begin{cases} n & \text{if $x\in [-\frac{1}{2n}, \frac{1}{2n}]$}; \\ 0 & \text{otherwise} \end{cases}$$ and note that $$\lim_{n\to\infty} \int_{-\infty}^\infty f(x)h_n(x)\, dx = f(0) = \int_{-\infty}^\infty f(x) \delta(x)\, dx$$ for reasonable functions $f$. We can therefore vaguely identify $\delta$ with $\lim_{n\to \infty} h_n$ and make statements like $$\delta(x) = \lim_{n\to\infty} h_n(x) = \begin{cases} \infty & \text{if $x = 0$}; \\ 0 & \text{otherwise}. \end{cases}$$ Of course, $\delta$ isn't even a function (nor are the $h_n$ uniquely defined), and so the previous equation doesn't make any sense. It's occasionally a convenient way of thinking about the issue, though, in the same way that a derivative $dy/dx$ is not literally a quotient but often behaves as if it were.
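As a concrete check of the limit above, take the test function $f(x)=1/(1+x^2)$, a hypothetical choice made here only because it lies in $X$ and its integral is elementary: $$\int_{-\infty}^\infty f(x)h_n(x)\, dx = n\int_{-\frac{1}{2n}}^{\frac{1}{2n}} \frac{dx}{1+x^2} = 2n\arctan\frac{1}{2n} \longrightarrow 1 = f(0) \quad\text{as } n\to\infty,$$ since $\arctan(t)/t\to1$ as $t\to0$. Each $h_n$ also has total integral $n\cdot\frac{1}{n}=1$, matching $\int_{-\infty}^\infty \delta = 1$.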
-
This response might get a good grade on a test, but it is not useful to a person asking such a question. – DanielV Jan 07 '15 at 22:49
-
I wouldn't say it isn't useful. The point is that the definition given ($\delta(0) = \infty$ and $\delta = 0$ elsewhere) is not correct; I tried to explain what the correct definition is and why that incorrect definition often pops up in this context. – anomaly Jan 07 '15 at 23:35
Probably one of the best ways to approach this problem is to use the measure-theoretic definition. That is, $$\delta_{x_0}(A)=\begin{cases}1, & x_0\in A\\0, & x_0\not\in A \end{cases}$$ Integrating the constant function $1$ against this measure over the whole real line gives $$\int_{\mathbb{R}}\delta_{x_0}(\mathrm{d}x)=\delta_{x_0}(\mathbb{R}),$$ and since $x_0\in\mathbb{R}=(-\infty,\infty)$, this equals $1$.
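A minimal sketch of this set function in Python, under the assumption (made here for illustration) that a set $A$ is represented by a membership predicate; the names `dirac_measure` and `integrate_against_dirac` are hypothetical.

```python
def dirac_measure(x0):
    """Return the set function A -> 1 if x0 is in A, else 0.

    A set A is represented by a predicate contains(x) -> bool (an assumption
    of this sketch, not something the answer specifies).
    """
    def measure(contains):
        return 1 if contains(x0) else 0
    return measure

def integrate_against_dirac(f, x0):
    """Integral of f with respect to delta_{x0}: all the mass sits at x0, so it is f(x0)."""
    return f(x0)

delta0 = dirac_measure(0.0)
whole_line = lambda x: True      # A = (-inf, inf)
positive = lambda x: x > 0       # A = (0, inf), which does not contain 0

print(delta0(whole_line), delta0(positive))
print(integrate_against_dirac(lambda x: 1.0, 0.0))
```

The measure of the whole line is $1$ no matter where $x_0$ sits, which is the measure-theoretic reading of $\int_{-\infty}^{\infty}\delta(x)\ dx=1$.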