[HN Gopher] Tensors, the geometric tool that solved Einstein's r...
___________________________________________________________________
Tensors, the geometric tool that solved Einstein's relativity
problem
Author : Luc
Score : 70 points
Date : 2024-08-12 15:11 UTC (7 hours ago)
(HTM) web link (www.quantamagazine.org)
(TXT) w3m dump (www.quantamagazine.org)
| senderista wrote:
| If you have any linear algebra background, then the definition of
| a tensor is straightforward: given a vector space _V_ over a
| field _K_ (in physics, _K_ = _R_ or _C_), a tensor _T_ is a
| multilinear (i.e. linear in each argument) function from vectors
| and dual vectors in _V_ to numbers in _K_. That's it! A type
| _(p, q)_ tensor _T_ takes _p_ vectors and _q_ dual vectors as
| arguments ( _p+q_ is often called the _rank_ of _T_ but is
| ambiguous compared to the type).
|
| (If you're unfamiliar with the definition of dual vector, it's
| even simpler: it's just a linear function from _V_ to _K_.)
| will-burner wrote:
| The definition may be simple, but it's not very concrete and
| I'd argue that makes it not straightforward. While examples of
| vector spaces can be very concrete (think R, R^2, R^30), I
| struggle to think of a concrete example of a multilinear
| function from vectors and dual vectors in V to numbers in K. On
| top of that when working with tensors, you don't usually use
| the definition as a multilinear function, at least as far as I
| remember.
| tel wrote:
| Not really to push back as I do agree that this is a bit
| trickier to get an intuition for than the OP suggests, but
| the most trivial concrete example of a (1, 1) tensor would
| just be the evaluation function (v, f) |-> f(v), which, given
| a metric, corresponds to the inner product.
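|
| A minimal numpy sketch of that evaluation map (the names v, f
| and evaluate are made up for illustration): the components of
| the dual vector f are stored as a plain array, so f(v) is just
| a dot product.
|
|     import numpy as np
|
|     v = np.array([1.0, 2.0, 3.0])   # a vector in R^3
|     f = np.array([4.0, 0.0, -1.0])  # components of a dual vector f: R^3 -> R
|
|     def evaluate(vec, dual):
|         # the (1, 1) "evaluation" tensor: (v, f) |-> f(v)
|         return dual @ vec
|
|     print(evaluate(v, f))                          # 1.0
|     print(evaluate(2 * v, f), 2 * evaluate(v, f))  # linear in v
|     print(evaluate(v, 2 * f), 2 * evaluate(v, f))  # linear in f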
| cshimmin wrote:
| A simple example of a multilinear function is the inner
| (a.k.a dot) product <a, b>: it takes a vector (b), and a dual
| vector (a^T), and returns a number. In tensor notation it's
| typically written d_ij.
|
| It's multilinear because it's linear in each of its arguments
| separately: <ca, b> = c<a,b> and <a, cb> = c<a,b>.
|
| Another simple but less obvious example is a rotation
| (orthogonal) matrix. It takes a vector as an input, and
| returns a vector. But a vector itself can be thought of as a
| linear function that takes a dual vector and returns a number
| (via the inner product, above!). So, applying the rotation
| matrix to a vector is a sort of "currying" on the multilinear
| map, while the matrix alone can be considered a function that
| takes a vector and a dual vector, and returns a number.
|
| In functional notation, you can consider your rotation matrix
| to be a function (V x V*) -> K, which can in turn be
| considered a function V -> (V* -> K), where V* is the dual
| space of V.
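|
| A rough Python sketch of that currying (the 90-degree rotation
| and all the names are just for illustration):
|
|     import numpy as np
|
|     # a rotation matrix in R^2, viewed as a bilinear map (V* x V) -> K
|     R = np.array([[0.0, -1.0],
|                   [1.0,  0.0]])
|
|     def as_bilinear(a_dual, v):
|         # feed a dual vector and a vector, get a number
|         return a_dual @ R @ v
|
|     def curried(v):
|         # feed only the vector, get back a function V* -> K
|         return lambda a_dual: a_dual @ R @ v
|
|     v = np.array([1.0, 0.0])
|     a = np.array([0.0, 1.0])  # components of a dual vector
|     print(as_bilinear(a, v), curried(v)(a))  # both print 1.0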
| senderista wrote:
| I think you're describing the evaluation map T(v, w) =
| w(v), which has type (1,1), rather than the inner product,
| which has type (2,0). The inner product lets you "raise and
| lower indices" (i.e. convert between vectors and dual
| vectors), so you can basically pretend that it is the
| evaluation map.
| klodolph wrote:
| I think part of this is "if you have a linear algebra
| background". There are a few different explanations of
| tensors, and different explanations make sense for different
| people.
| adrian_b wrote:
| In physics, the first and even now the most important
| application of multilinear functions, a.k.a. tensors, is in
| the properties of anisotropic solids.
|
| A solid can be anisotropic, i.e. with properties that depend
| on the direction, either because it is crystalline or because
| there are certain external influences, like a force or an
| electric field or a magnetic field that are applied in a
| certain direction.
|
| In (linear) anisotropic solids, a vector property that
| depends on another vector property is no longer collinear
| with the source, but it has another direction, so the output
| vector is a bilinear function of the input vector and of the
| crystal orientation, i.e. it is obtained by the
| multiplication with a matrix. This happens for various
| mechanical, optical, electric or magnetic properties.
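|
| A toy numpy sketch of that idea (the matrix entries are made up
| and do not describe any real material): an anisotropic response
| matrix sends an applied field to an output that is generally
| not collinear with it.
|
|     import numpy as np
|
|     # made-up second-order "conductivity" tensor in a fixed basis
|     sigma = np.array([[3.0, 0.5, 0.0],
|                       [0.5, 1.0, 0.0],
|                       [0.0, 0.0, 0.2]])
|
|     E = np.array([1.0, 0.0, 0.0])  # applied field along x
|     J = sigma @ E                  # response of the material
|
|     print(J)  # [3.  0.5 0. ] -- not parallel to E
|     cos_angle = J @ E / (np.linalg.norm(J) * np.linalg.norm(E))
|     print(cos_angle)  # < 1, so the response is tilted away from E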
|
| When there are more complex effects, which connect properties
| from different domains, like piezoelectricity, which connects
| electric properties with mechanical properties, then the
| matrices that describe vector transformations, a.k.a. tensors
| of the second order, may depend on other such tensors of the
| second order, so the corresponding dependence is described by
| a tensor of the fourth order.
|
| So the tensors really appear in physics as multilinear
| functions, which compute the answers to questions like "if I
| apply a voltage on the electrodes deposited on a crystal in
| these positions, what will be the direction and magnitude of
| the displacements of certain parts of the crystal". While in
| isotropic media you can have relationships between vectors
| that are described by scalars and relationships between
| scalars that are also described by scalars, the corresponding
| relationships for anisotropic media become much more
| complicated and the simple scalars are replaced everywhere by
| tensors of various orders.
|
| What in an isotropic medium is a simple proportionality
| becomes a multilinear function in an anisotropic medium.
|
| The distinction between vectors and dual vectors appears only
| when the coordinate system does not use orthogonal axes,
| which makes all computations much more complicated.
|
| The anisotropic solids have become extremely important in
| modern technology. All the high-performance semiconductor
| devices are made with anisotropic semiconductor crystals.
| AnotherGoodName wrote:
| For those without a strong math background but more of a
| programmers background;
|
| You know the matrices you work with in 2D or 3D graphics
| environments that you can apply to vectors or even other
| matrices to more easily transform (rotate, translate, scale)?
|
| Well tensors are the generalisation of this concept. If you've
| noticed 2D games transformation matrices seem similar (although
| much simpler) to 3D games' transformation matrices you've
| probably wondered what it'd look like for a 4D spacetime or
| even more complex scenarios. Well you've now started thinking
| about tensors.
| andrewla wrote:
| Slight disagree here -- matrices are enough for
| transformations in 2, 3, 4, and 100 dimensions. Tensors are
| not arrays with more rows and columns; they are higher
| dimensional objects -- more indices, not greater range of
| indices.
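|
| In numpy-array terms (purely illustrative):
|
|     import numpy as np
|
|     M = np.zeros((100, 100))  # still a matrix: 2 indices, each with range 100
|     T = np.zeros((3, 3, 3))   # a higher-order object: 3 indices, each with range 3
|
|     print(M.ndim, T.ndim)     # 2 3  -- the number of indices
|     print(M.shape, T.shape)   # (100, 100) (3, 3, 3)  -- the range of each index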
| SkyBelow wrote:
| Is a tensor higher dimension, or is it the generalized form
| of the structure encompassing all of it (individual numbers
| (0 dimensions), vectors (1 dimension), matrixes (2
| dimensions), and so on)? Kind of like how an n-sphere
| describes circles and spheres for n equal to 2 or 3.
| andrewla wrote:
| The problem is that I'm using dimensional in two
| different senses. A vector with N elements, for example,
| is a one-dimensional array of numbers operating on N
| dimensional spaces, and so on for matrices and tensors.
|
| If you're doing graphics programming you're operating in
| three and four dimensional spaces mostly (four
| dimensional being projective spaces), because that's the
| space you're trying to render in two dimensions. But
| you'll rarely need anything higher than a matrix (a two-
| dimensional data structure) for operations on that space.
|
| If you're doing physics you're operating in three, four,
| and infinite dimensional spaces, mostly. And you'll
| routinely use higher data structures -- even things like
| moment of inertia for rigid bodies can't really be
| described without rank 3 tensors (a three-dimensional
| data structure).
|
| In statistics and machine learning, you're operating in
| very high dimensional spaces, and will find yourself
| using non-square tensors especially (in the other areas
| everything will be square). The data structures will
| generally be high dimensional as well, but usually just a
| function of model complexity; so maybe 4 or 5 dimensional
| data structures.
| enugu wrote:
| Any given tensor has a type (p,q). Say d=p+q. Then d is
| the 'dimension' of the tensor in the sense that you would
| need nxnx...n=n^d numbers (think of an nxnx...n array) to
| describe the tensor where n is the dimension of the
| underlying vector space.
|
| > or is it the generalized form of the structure
| encompassing all of it...Kind of like how an n-sphere
|
| If you are asking whether there are examples of tensors
| parametrized by an integer d, you can cook up examples -
| like the (d,0) tensor whose input is a sequence of d
| vectors and which adds up all the components (with respect
| to some basis) of the vector in each slot and then
| multiplies these sums across all the d slots.
|
| But just like an n-sphere is a special example of a
| polynomial in n variables, the above tensor is a specific
| example - it is the tensor where all the components are 1
| in the higher dimensional array.
|
| Sometimes, one considers the algebra of tensors across
| all dimensions like the symmetric algebra or exterior
| algebra simultaneously (where there is multiplication
| operation between tensors of different dimensions), but
| that might not be what you were asking about.
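|
| A throwaway numpy check of that all-ones example (the values of
| n, d and the input vectors are picked arbitrarily):
|
|     import numpy as np
|
|     n, d = 3, 4                    # dimension of V and order of the tensor
|     T = np.ones((n,) * d)          # the all-ones (d, 0) tensor: n^d entries
|     vs = [np.array([1.0, 2.0, 3.0]) for _ in range(d)]  # d input vectors
|
|     # feed one vector into each slot: sum over all indices of
|     # T[i1..id] * v1[i1] * ... * vd[id]  (subscripts written for d = 4)
|     value = np.einsum('abcd,a,b,c,d->', T, *vs)
|
|     # for the all-ones tensor this is the product of the component sums
|     print(value, np.prod([v.sum() for v in vs]))  # 1296.0 1296.0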
| thechao wrote:
| > (individual numbers (0 dimensions), vectors (1
| dimension), matrixes (2 dimensions), and so on)
|
| You're using the word "dimension" in two (distinct) ways.
| Instead, use the word "rank":
|
| > (individual numbers (0 rank), vectors (1 rank),
| matrixes (2 rank), and so on)
|
| Now, we can talk about a 4-dimensional rank-1 tensor,
| e.g., a 4-element vector.
|
| Now, think about a 4x4 matrix: if we multiply the matrix
| by a 4-vector, we get a 4-vector out: in some ways, the
| multiplication has "eaten" one of the _ranks_ of the
| matrix; but, the dimension of the resulting object is the
| same. If we had a 3x2 matrix, and we multiplied it by a
| 3-vector, then both the rank has changed (from 2 to 1)
| _and_ the dimension has changed (from 3 to 2).
|
| A tensor can have any rank.
|
| More importantly, the ranks of a tensor come in two
| "flavors": a vector and a one-form. The concepts are
| pretty darn general, but one way to get a feel for how
| they're related is that the transpose of a vector can be
| its dual. This gets into things like pre- and post-
| multiplication; or, whether we 'covary' or 'contravary'
| with respect to the tensor.
|
| Frankly, tensor products are a beast to deal with
| mechanically, so the literature mostly treats them as
| opaque objects. Modern tensor software libraries and
| high-performance computing have brought a sea change in
| the use of GR.
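|
| In numpy terms, with "rank" meaning the number of indices as
| above (a quick illustrative sketch):
|
|     import numpy as np
|
|     M = np.arange(16.0).reshape(4, 4)  # 4-dimensional, rank 2
|     v = np.ones(4)                     # 4-dimensional, rank 1
|
|     w = M @ v             # contraction "eats" one rank: 2 -> 1
|     print(w.shape)        # (4,)  same dimension, lower rank
|
|     N = np.arange(6.0).reshape(3, 2)   # a 3x2 matrix
|     u = np.ones(3)
|     print((u @ N).shape)  # (2,)  rank dropped 2 -> 1 and dimension 3 -> 2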
| mitthrowaway2 wrote:
| Is there a difference between a 4x4 matrix and a 4x4 tensor?
| AnotherGoodName wrote:
| A matrix is a special case of a tensor and the 4x4 matrix is
| absolutely a tensor. Tensors can be way more complex and do
| more but the 4x4 matrix you use in 3D operations is a great
| starting point.
| tel wrote:
| In short, tensors generalize matrices. While we can
| probably guess accurately what "4x4 matrix" means, "4x4
| tensor" is missing some information to really nail down
| what it means.
|
| Interestingly, that extra information helps us to
| differentiate between the same matrix being used in
| different "roles". For instance, if you have a 4x4 matrix
| A, you might think of it like a linear transformation.
| Given x: V = R^4 and y = Ax, then y is another vector in V.
| Alternatively, you might think of it like a quadratic form.
| Given two vectors x, y: V, the value xAy is a real number.
|
| In linear algebra, we like to represent both of those
| operations as a matrix. On the other hand, those are
| _different_ tensors. The first would be a rank-(1,1)
| tensor, the second a rank-(2, 0) tensor.
|
| Ultimately, we might write down both of those tensors with
| the same 4x4 array of 16 numbers that we use to represent
| 4x4 matrices, but in the sort of math where all these
| subtle differences start to really matter there are
| additional rules constraining how rank-(1, 1) tensors are
| distinct from rank-(2, 0) tensors.
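|
| A small numpy illustration of the "same array, two roles" point
| (the values are random; only the shapes and the two usages
| matter):
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     A = rng.standard_normal((4, 4))  # one 4x4 array of numbers
|     x = rng.standard_normal(4)
|     y = rng.standard_normal(4)
|
|     # role 1: a rank-(1,1) tensor, i.e. a linear map V -> V
|     print((A @ x).shape)  # (4,)  -- a vector
|
|     # role 2: a rank-(2,0)-style bilinear form, V x V -> R
|     print(x @ A @ y)      # a single real number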
| pyinstallwoes wrote:
| Is there a similar "gimbal lock" problem that people ran
| into?
| jesuslop wrote:
| To add a bit, kudos to root-parent boil-down. Programmers
| have already the good representation, and call it
| n-dimensional array, that being a list of lists of lists ...
| (repeat n times) ... of lists of numbers. The only nuisance
| is that what programmers call dimension, math people call
| rank. It is the sizes of those nested lists that math people
| call dimensions. It's all set up for a comedy of errors. Also
| in math the rank is split to make explicit how many of the
| rank-many arguments are vectors and how many are dual
| vectors. You'd say something like this is a rank 7 tensor, 3
| times covariant (3 vector arguments) and 4 times
| contravariant (4 dual vector arguments) summing 7 total
| arguments. I'm assuming a fixed basis, so the root-parent map
| determines a number array.
| paulpauper wrote:
| this is still not it. 4d would simply be a 4x4 matrix instead
| of 3x3
|
| tensors are something which no one has been able to fully or
| adequately describe. I think you simply have to treat them as
| a set of operations and not try to map or force them onto
| existing concepts like linear algebra or matrices. they are
| similar but otherwise something completely different.
| g15jv2dp wrote:
| That's incorrect if V is infinite-dimensional. A (0,1)-tensor
| is just supposed to be an element of V but with your definition
| you get an element of the bidual of V. Which is not isomorphic
| to V when dim V is infinite. And even when dim V is finite, you
| need to choose a basis of V to find an isomorphism with the
| bidual. From a math point of view, that's just no good.
| senderista wrote:
| No, the isomorphism between V and V** (for finite-dimensional
| V) is canonical. The canonical isomorphism T:V->V** is easy
| to construct: map a vector v in V to the element of V** which
| takes an element w from V* and applies it to v: T(v)(w) =
| w(v).
| g15jv2dp wrote:
| GP is giving you an element of V**. You want to turn it
| into a vector. To do that, please make the inverse
| isomorphism explicit without using a basis. I'll wait...
| xanderlewis wrote:
| You said even when dim V is finite you need a basis to
| find an isomorphism with V**. But that's not true. You're
| right if you mean V*, but not the bidual.
| senderista wrote:
| Why? All I need to prove is that I have a canonical
| linear map going in one direction, and that this map is a
| bijection. (Since I already constructed such a bijection,
| I could just answer "for any z in V**, take the unique
| element v in V such that for all w in V*, w(v) = z(w)".)
| Do you disagree that I provided such a map? You are
| correct that V** is not isomorphic to V when V is
| infinite-dimensional, but your statement that "when dim V
| is finite, you need to choose a basis of V to find an
| isomorphism with the bidual" is incorrect. This is
| elementary textbook stuff (e.g. first chapter of Wald's
| GR book), so I won't argue further.
| btilly wrote:
| You are right about infinite dimensions, wrong about finite
| dimensions. V and V* are naturally isomorphic for finite
| dimensions.
|
| In finite dimensions, V and V* are isomorphic, but not
| naturally so. The isomorphism requires additional
| information. You can specify a basis to get the isomorphism,
| but many bases will give the same isomorphism. The exact
| amount of information that you need is a metric. If you have
| a metric, then every orthonormal basis in that metric will
| give the same isomorphism.
| ogogmad wrote:
| You need to correct the 2nd sentence to say that V and V**
| are naturally isomorphic. V and V* are only _unnaturally_
| isomorphic. All of this holds only in finite dimensions, of
| course.
| g15jv2dp wrote:
| You have a typo in your first line, and I answered a
| sibling comment about that. Metrics are irrelevant to the
| discussion (and I presume you wanted to write "norm"
| instead of "metric").
| btilly wrote:
| Yes, I had a typo.
|
| As for why I said metric, see
| https://en.wikipedia.org/wiki/Metric_tensor. Which is
| technically a concept from differential geometry rather
| than linear algebra. But then again, tensors are
| literally the topic that started this. And it is only in
| differential geometry that I've ever cared about mapping
| from V to V*.
| g15jv2dp wrote:
| There are no manifolds at all in this discussion... What
| are you even talking about? Just stringing words you
| vaguely know together??
| btilly wrote:
| This comment is in a discussion about an article titled,
| _Tensors, the geometric tool that solved Einstein's
| relativity problem_. Therefore, "tensors are literally
| the topic that started this discussion."
|
| Hopefully that's a hint that you should attempt to figure
| out what someone might be talking about before going to
| schoolyard insults.
| mr_mitm wrote:
| Yes, very simple, except that when physicists say "tensor",
| they mean tensor fields, on smooth, curved manifolds, in at
| least four dimensions, often with a Lorentz metric. Things stop
| being simple quickly.
| adrian_b wrote:
| It does not matter on what set a tensor field is defined.
|
| A tensor field is not a tensor, but the value of a tensor
| field at any point is a tensor, which satisfies the
| definition given above, exactly like the value of a vector
| field at any point is a vector.
|
| The "fields" are just functions.
|
| There are physics books that do not give the easier to
| understand definition given above, but they give an
| equivalent, but more obscure, definition of a tensor, by
| giving the transformation rules for its contravariant
| components and for its covariant components at a change of
| the reference system.
|
| The word "tensor" with the current meaning has been used for
| the first time by Einstein and he has not given any
| explanation for this word choice. The theory of tensors that
| Einstein has learned had not used the word "tensor".
|
| Before Einstein, the word "tensor" (coined by Hamilton) was
| used in physics with the meaning of "symmetrical matrix",
| because the geometric (affine) transformation of a body that
| is determined by the multiplication with a symmetric matrix
| extends (or compresses) the body along certain directions
| (the axes of the rotation that would diagonalize the
| symmetric matrix). The word "tensor" in the
| old sense was applied only to what is called now "symmetric
| tensor of the second order" (which remains the most important
| kind of the tensors that are neither vectors nor scalars).
| prof-dr-ir wrote:
| > The "fields" are just functions.
|
| I think this is far too simplistic, for one because the
| values of this putative function depend on the chosen
| coordinate system.
|
| So I completely agree with the comment you are replying to:
| when a physicist says "tensor" they really mean a "tensor
| field" and the definition of the latter is quite a bit more
| involved than just specifying a multilinear map at each
| point of a manifold.
| mr_mitm wrote:
| Plus, as if tensor fields on Lorentz manifolds weren't
| already complicated enough, physicists aren't happy until
| they can write down some differential equations. So not
| only are you doing calculus, you're doing it on curved
| manifolds, with complicated tensor objects, in the
| context of partial differential equations, which - in the
| case of general relativity - are non-linear. It's okay to
| admit that all of this is a bit hard. Hell, as the
| article points out, Einstein himself had trouble
| understanding them.
| adrian_b wrote:
| The values of the "putative function" do not depend on
| the chosen coordinate system.
|
| This is the essence of notions like scalar, vector,
| tensor, that they do not depend on the chosen coordinate
| system.
|
| Only their numeric representations associated with a
| chosen coordinate system do depend on that system.
|
| If you compute some arbitrary functions of the numeric
| components of a tensor in a certain coordinate system, in
| most cases the array of numbers that composes the result
| will not be a tensor, precisely because the result will
| really be different in any other coordinate system, while
| a tensor must be invariant.
|
| All physical laws are formulated only using various kinds
| of tensors, including vectors and scalars, precisely
| because they must be invariant under the choice of the
| coordinate system.
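|
| A tiny numpy demonstration of the invariance point above (an
| arbitrary function of the components changes with the basis,
| while a proper contraction does not), using an orthonormal
| change of basis, chosen so that the covariant and contravariant
| rules happen to coincide; all numbers are made up:
|
|     import numpy as np
|
|     theta = 0.3
|     R = np.array([[np.cos(theta), -np.sin(theta)],
|                   [np.sin(theta),  np.cos(theta)]])  # basis change
|
|     v = np.array([1.0, 2.0])   # vector components in the old basis
|     f = np.array([0.5, -1.0])  # dual-vector components in the old basis
|
|     v_new = R @ v              # components in the new basis
|     f_new = R @ f              # same rule here, since R is orthogonal
|
|     print(f @ v, f_new @ v_new)  # the contraction f(v) is unchanged
|     print(v.sum(), v_new.sum())  # an arbitrary function of components is not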
| abdullahkhalids wrote:
| > There are physics books that do not give the easier to
| understand definition given above, but they give an
| equivalent, but more obscure, definition of a tensor, by
| giving the transformation rules for its contravariant
| components and for its covariant components at a change of
| the reference system.
|
| The definition of a tensor as linear maps, while simple to
| understand, has no content that is useful for doing
| physics. To do any physics, or for that matter, any
| geometry with tensors, you need to define the notion of
| covariance and contravariance.
|
| Besides, starting with the latter notions allows you to
| define tensors more naturally. You start with trying to
| understand how geometric objects transform under coordinate
| transformations, and you slowly but surely end up with
| tensors.
| tacomonstrous wrote:
| Covariance and contravariance are mathematical notions,
| and have to do with whether each multiplicative
| constituent or the tensor is a vector in your given
| vector space (covariant) or a linear functional on this
| space (contravariant). There is no inherent physical
| meaning to either concept.
| adrian_b wrote:
| No, the definition of a tensor as a linear map, is the
| only definition that is useful for doing physics.
|
| All the physical quantities that are defined to be
| tensors are quantities used to transform either vectors
| into other vectors or tensors of higher orders into other
| tensors of higher orders (for instance the transformation
| between the electric field vector and the electric
| polarization vector).
|
| Therefore all such physical quantities are used to
| describe multilinear functions, either in linear
| anisotropic media, or in non-linear anisotropic media,
| but in the latter case they are applicable only to
| relations between small differences, where linear
| approximations may be used.
|
| The multilinear function is the physical concept that is
| independent of the coordinate system. The concrete
| computations with a tensor a.k.a. multilinear function
| may need the computation of contravariant and/or
| covariant components in a particular coordinate system
| and the use of their transformation rules. On the other
| hand, the abstract formulation of the physical laws does
| not need such details, but only the high-level
| definitions using multi-linear functions, and it is
| independent of any choice for the coordinate system.
|
| There is a unique multilinear function a.k.a. tensor, but
| it can be expressed by an infinity of different arrays of
| numbers, corresponding to various combinations of
| contravariant or covariant components, in various
| coordinate systems. Their transformation rules can be
| determined by the condition that they must represent the
| same function. In the books that do not explain this, the
| rules appear to be magic and they do not allow an
| understanding of why the rules are these and not others.
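|
| As a small numpy sketch of that last paragraph (a (0,2)-type
| example; the form B and the basis-change matrix P, whose
| columns are the new basis vectors, are made up): requiring the
| same output value is exactly what fixes the transformation
| rule for the components.
|
|     import numpy as np
|
|     B = np.array([[2.0, 1.0],
|                   [0.0, 3.0]])  # components of a bilinear form, old basis
|     P = np.array([[1.0, 1.0],
|                   [0.0, 2.0]])  # columns = new basis vectors in old coordinates
|
|     B_new = P.T @ B @ P          # the transformation rule for these components
|
|     u_new = np.array([1.0, 1.0])   # coordinates of a vector in the new basis
|     w_new = np.array([2.0, -1.0])
|     u_old = P @ u_new              # the same vectors in old coordinates
|     w_old = P @ w_new
|
|     # the multilinear function gives the same number in either description
|     print(u_old @ B @ w_old, u_new @ B_new @ w_new)  # -12.0 -12.0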
| cfgauss2718 wrote:
| Very well, let's just agree that in physics (r,s) tensors
| usually refer to sections of the tensor product of some
| fixed number of copies of the tangent bundle (r copies) and
| cotangent bundle (s copies) of a smooth manifold (almost
| always pseudo-Riemannian) and leave it there. Elementary!
| DemocracyFTW2 wrote:
| A monad is just a monoid in the category of endofunctors,
| what's the problem?
| bratwurst3000 wrote:
| you could also call it a matrix to modify a matrix in a vector
| space. not 100% correct. xD
| bjourne wrote:
| Well, I can write a definition that is both easier to
| understand and shorter than yours:
|
| A tensor is a multi-dimensional array.
|
| :)
| eigenket wrote:
| This only works in finite dimensions, which for
| mathematicians excludes pretty much all of the interesting
| cases.
| ithinkso wrote:
| This is actually a harmful definition: both (1,1) and (0,2)
| tensors can be written as a matrix, but they are _very_
| different. It's like calling a vector an array, but vectors
| require a vector space and arrays are just arrays. It doesn't
| help that std::vector is very common in CS, but 'pushing back'
| to a mathematical vector just doesn't make any sense.
| ndriscoll wrote:
| I think the people who find this definition to be mysterious
| are really looking for (borrowing from Ravi Vakil[0]) "why is a
| tensor" rather than "what is a tensor". In that case, a better
| answer IMO is that it's the "most generic" way to multiply
| vectors that's compatible with the linear structure: "v times
| w" is _defined to be the symbol_ "v[?]w". There is no meaning
| to that symbol.
|
| But these things are vectors, so you could write e.g. v =
| a[?]x+b[?]y, and then you want e.g. (a[?]x+b[?]y)[?]w = ax[?]w
| + by[?]w, and so on.
|
| So in some sense, the quotient space construction[1] gives a
| better "why". It says
|
| * I want to multiply vectors in V and W. So let's just start by
| writing down that "v times w" is the symbol "v⊗w", and I want
| to have a vector space, so take the vector space generated by
| all of these symbols.
|
| * But I also want that (v_1+v_2)⊗w = v_1⊗w + v_2⊗w
|
| * And I also want that v⊗(w_1+w_2) = v⊗w_1 + v⊗w_2
|
| * And I also want that (sv)⊗w = s(v⊗w) = v⊗(sw)
|
| And that's it. However you want to concretely define tensors,
| they ought to be "a way to multiply vectors that follows those
| rules". Quotienting is a generic technique to say "start with
| this object, and add this additional rule while keeping all of
| the others".
|
| Another way to say this is that the tensor algebra is the "free
| associative algebra": it's a way to multiply vectors where the
| only rules you have to reduce expressions are the ones you
| needed to have.
|
| [0] https://www.youtube.com/live/mqt1f8owKrU?t=500
|
| [1]
| https://en.wikipedia.org/wiki/Tensor_product#As_a_quotient_s...
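|
| In coordinates, one concrete model of v⊗w is the outer product
| of the component arrays; a quick numpy check that it obeys the
| rules listed above (a sketch of the finite-dimensional case,
| not the abstract quotient construction itself):
|
|     import numpy as np
|
|     def t(v, w):
|         # coordinate model of v⊗w: the outer product of component arrays
|         return np.outer(v, w)
|
|     v1 = np.array([1.0, 2.0]); v2 = np.array([0.0, 3.0])
|     w = np.array([4.0, 5.0, 6.0])
|     s = 7.0
|
|     print(np.allclose(t(v1 + v2, w), t(v1, w) + t(v2, w)))  # True
|     print(np.allclose(t(v1, w + w), t(v1, w) + t(v1, w)))   # True
|     print(np.allclose(t(s * v1, w), s * t(v1, w)))          # True
|     print(np.allclose(t(s * v1, w), t(v1, s * w)))          # True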
| max_likelihood wrote:
| I've always thought the use of "Tensor" in the "TensorFlow"
| library is a misnomer. I'm not too familiar with ML/theory, is
| there a deeper geometric meaning to the multi-dimensional array
| of numbers we are multiplying or is "MatrixFlow" a more
| appropriate name?
| itishappy wrote:
| The tensors in tensorflow are often higher dimensional. Is a 3d
| block of numbers (say 1920x1080x3) still a matrix? I would
| argue it's not. Are there transformation rules for matrices?
|
| You're totally correct that the tensors in tensorflow do drop
| the geometric meaning, but there's precedent there in how CS
| vs. math folk use vectors.
| andrewla wrote:
| Matrices are strictly two-dimensional arrays (together with
| some other properties, but for a computer scientist that's it).
| Tensors are the generalization to higher dimensional arrays.
| MathMonkeyMan wrote:
| The joke I learned in a Physics course is "a vector is
| something that transforms like a vector," and "a tensor is
| something that transforms like a tensor." It's true, though.
|
| The physicist's tensor is a matrix of functions of coordinates
| that transform in a prescribed way when the coordinates are
| transformed. It's a particular application of the chain rule
| from calculus.
|
| I don't know why the word "tensor" is used in other contexts.
| Google says that the etymology of the word is:
|
| > early 18th century: modern Latin, from Latin tendere 'to
| stretch'.
|
| So maybe the different senses of the word share the analogy of
| scaling matrices.
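|
| A concrete, made-up instance of the chain-rule statement above,
| for the gradient (a covector field) of a scalar field in the
| plane:
|
|     import numpy as np
|
|     # scalar field phi(x, y) = x^2 + y and its Cartesian gradient components
|     def grad_cartesian(x, y):
|         return np.array([2 * x, 1.0])  # (dphi/dx, dphi/dy)
|
|     # a point given in polar coordinates, with x = r cos(t), y = r sin(t)
|     r, t = 2.0, 0.6
|     x, y = r * np.cos(t), r * np.sin(t)
|
|     # Jacobian d(x, y)/d(r, t): the "prescribed way" the components transform
|     J = np.array([[np.cos(t), -r * np.sin(t)],
|                   [np.sin(t),  r * np.cos(t)]])
|
|     # chain rule: (dphi/dr, dphi/dt) = (dphi/dx, dphi/dy) . d(x, y)/d(r, t)
|     grad_polar = grad_cartesian(x, y) @ J
|
|     # direct derivatives of phi(r, t) = r^2 cos(t)^2 + r sin(t)
|     dphi_dr = 2 * r * np.cos(t) ** 2 + np.sin(t)
|     dphi_dt = -2 * r ** 2 * np.cos(t) * np.sin(t) + r * np.cos(t)
|     print(grad_polar, np.array([dphi_dr, dphi_dt]))  # the two agree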
| ogogmad wrote:
| The mathematical definition is 99% equivalent to the physical
| one. I find that the physical one helps to motivate the
| mathematical one by illustrating the numerical difference
| between the basis-change transformation for (1,0)- and
| (0,1)-tensors. The mathematical one is then simpler and more
| conceptual once you've understood that motivation. The
| concept of a tensor really belongs to linear algebra, but
| occurs mostly in differential geometry.
|
| There is still a "1% difference" in meaning though. This
| difference allows a physicist to say "the Christoffel symbols
| are _not_ a tensor", while a mathematician would say this is
| a conflation of terms.
|
| TensorFlow's terminology is based on the rule of thumb that a
| "vector" is really a 1D array (think column vector), a
| "matrix" is really a 2D array, and a "tensor" is then an nD
| array. That's it. This is offensive to physicists especially,
| but ¯\\_(ツ)_/¯
| bjourne wrote:
| The problem with the physicist's definition is that the
| larger the N the less the geometrical interpretation makes
| sense. For 1, 2, and even 3-dimensional tensors there is
| some connection to geometry, but eventually it loses all
| meaning. The physicist has to give up and "admit" that an
| N-dimensional tensor really is just a collection of
| (N-1)-dimensional tensors.
| blt wrote:
| There is no geometric meaning. It's a really bad name.
| dannymi wrote:
| In the first example on
| https://www.tensorflow.org/api_docs/python/tf/math/multiply you
| can see that they use the Hadamard product (not the matrix
| product):
|
|     x = tf.constant([1, 2, 3, 4])
|     tf.math.multiply(x, x)
|     # <tf.Tensor: shape=(4,), dtype=..., numpy=array([ 1, 4, 9, 16], dtype=int32)>
|
| I could stop right here since it's a counterexample to x being
| a matrix (with a matrix product defined on it; P.S. try
| tf.matmul(x, x)--it will fail; there's no .transpose either).
| But that's only technically correct :)
|
| So let's look at tensorflow some more:
|
| The tensorflow tensors should transform like vectors would
| under change of coordinate system.
|
| In order to see that, let's do a change of coordinate system.
| To summarize the stuff below: if L1 and W12 are indeed tensors,
| it should be true that (W12 A^-1)(A L1) = W12 L1.
|
| Try it (in tensorflow) and see whether the new tensor obeys the
| tensor laws after the transformation. Interpret the changes to
| the nodes as covariant and the changes to the weights as
| contravariant:
|
|     import tensorflow as tf
|
|     # Initial outputs of one layer of nodes in your neural network
|     L1 = tf.constant([2.5, 4, 1.2], dtype=tf.float32)
|     # Our evil transformation matrix (coordinate system change)
|     A = tf.constant([[2, 0, 0], [0, 1, 0], [0, 0, 0.2]], dtype=tf.float32)
|     # Weights (no particular values; "random")
|     W12 = tf.constant(
|         [[-1, 0.4, 1.5],
|          [0.8, 0.5, 0.75],
|          [0.2, -0.3, 1]], dtype=tf.float32
|     )
|     # Covariant tensor nature; varying with the nodes
|     L1_covariant = tf.matmul(A, tf.reshape(L1, [3, 1]))
|     A_inverse = tf.linalg.inv(A)
|     # Contravariant tensor nature; varying against the nodes
|     W12_contravariant = tf.matmul(W12, A_inverse)
|     # Now derive the inputs for the next layer using the
|     # transformed node outputs and weights
|     L2 = tf.matmul(W12_contravariant, L1_covariant)
|     # Compare to the direct way
|     L2s = tf.matmul(W12, tf.reshape(L1, [3, 1]))
|     # assert L2 == L2s
|
| A tensor (like a vector) is actually a very low-level object
| from the standpoint of linear algebra. It's not hard at all to
| make something a tensor. Think of it like geometric "assembly
| language".
|
| In comparison, a matrix is rank 2 (and not all matrices
| represent tensors). That's it. No rank 3, rank 4, rank 1 (!!).
| So how does a matrix help you, really?
|
| If you mean that the operations in tensorflow (and numpy before
| it) aren't beautiful or natural, I agree. It still works,
| though. If you want to stick to ascii and have no indices on
| names, you can't do much better (otherwise, use
| Cadabra[1]--which is great). For example, it was really
| difficult to write the stuff above without using indices and
| it's really not beautiful this way :(
|
| More detail on https://medium.com/@quantumsteinke/whats-the-
| difference-betw...
|
| See also http://singhal.info/ieee2001.pdf for a primer on
| information science, including its references, for vector
| spaces with an inner product that are usually used in ML. The
| latter are definitely geometry.
|
| [1] https://cadabra.science/ (also in mogan or texmacs) -
| Einstein field equations also work there and are beautiful
| andrewla wrote:
| In TensorFlow the tf.matmul function or the @ operator
| perform matrix multiplication. Element-wise multiplication
| ends up being useful for a lot of parallelizable computation
| but should not be confused with matrix multiplication.
| twothreeone wrote:
| I agree. Just like NumPy's Einsum. "Multi-Array Flow" doesn't
| sound sexy and associating your project with a renowned
| physicist's name gives your project that "we solve big science
| problems" vibe by association. Very pretentious, very
| predictable, and very cringe.
| adrian_b wrote:
| Since the beginning of computer technology, "array" is the term
| that has been used for any multi-dimensional array, with
| "vectors" and "matrices" being special kinds of arrays. An
| exception was COBOL, which had a completely different
| terminology in comparison with the other programming languages
| of that time. Among the long list of differences between COBOL
| and the rest were e.g. "class" instead of "type" and "table"
| instead of "array". Some of the COBOL terminology has been
| inherited by languages like SQL or Simula 67 (hence the use of
| "class" in OOP languages).
|
| A "tensor", as used in mathematics in physics is not any array,
| but it is a special kind of array, which is associated with a
| certain coordinate system and which is transformed by special
| rules whenever the coordinate system is changed.
|
| The "tensor" in TensorFlow is a fancy name for what should be
| called just "array". When an array is bidimensional, "matrix"
| is an appropriate name for it.
| openrisk wrote:
| > Talk to a computer scientist, and they might tell you that a
| tensor is an array of numbers that stores important data
|
| The conflicting definitions of tensors have precedent in lower
| dimensions: _vectors_ were already being used in computer science
| to mean something different than in mathematics / physics, long
| before the current tensormania.
|
| It's not clear if that ambiguity will ever be a practical problem
| though. For as long as such structures are containers of
| numerical data with no implied transformation properties we are
| really talking about two different universes.
|
| Things might get interesting though in the overlap between
| information technology and geometry [1] :-)
|
| [1] https://en.wikipedia.org/wiki/Information_geometry
| nyrikki wrote:
| I would argue that today, geometric algebra/Clifford calculus and
| space time algebra are more intuitive and useful.
|
| Gibbs/Heaviside vectors were more popular at the time.
|
| At least for me.
| eigenket wrote:
| Nah, at least in the context of General Relativity tensors and
| tensor fields are everything. Geometric algebra doesn't add
| much.
| bollu wrote:
| I've written about [this explanation of tensors](https://pixel-
| druid.com/articles/tensor-is-a-thing-that-tran...) before, and it
| seems worthwhile to write it down again:
|
| There are two ways of using linear maps in the context of
| physics. One is as a thing that acts on the space. The other is
| a thing that acts on the coordinates. So when we talk about
| transformations in tensor analysis, we're talking about
| coordinate transformations, not space transformations. Suppose I
| implement a double-ended queue using two pointers:
|
| ```
| #include <stdlib.h>
| #include <string.h>
|
| struct Queue { int *memory, *start, *end; };
|
| void queue_init(struct Queue *q, int size) {
|     q->memory = malloc(sizeof(int) * size);
|     q->start = q->end = q->memory + (size - 1) / 2;
| }
| void queue_push_start(struct Queue *q, int x) { q->start--; *q->start = x; }
| void queue_push_end(struct Queue *q, int x)   { q->end++; *q->end = x; }
| int  queue_head(struct Queue *q)        { return *q->start; }
| int  queue_tail(struct Queue *q)        { return *q->end; }
| void queue_deque_head(struct Queue *q)  { q->start++; }
| void queue_deque_tail(struct Queue *q)  { q->end--; }
| ```
|
| See that the state of the queue is technically three numbers, {
| memory, start, end } (pointers are just numbers after all). But
| this is coordinate dependent, as start and end are relative to
| the location of memory. Now suppose I have a procedure to
| reallocate the queue size:
|
| ```
| void queue_realloc(struct Queue *q, int new_size) {
|     int start_offset = q->start - q->memory;
|     int end_offset   = q->end - q->memory;
|     int *oldmem = q->memory;
|     q->memory = malloc(sizeof(int) * new_size);
|     memcpy(q->memory + start_offset, oldmem + start_offset,
|            sizeof(int) * (end_offset - start_offset + 1));
|     free(oldmem);
|     q->start = q->memory + start_offset;
|     q->end   = q->memory + end_offset;
| }
| ```
|
| Notice that when I do this, the values of start and end can be
| completely different! However, see that the length of the queue,
| given by (end - start), is invariant: it hasn't changed!
|
| ---
|
| In the exact same way, a "tensor" is a collection of numbers that
| describes something physical with respect to a particular
| coordinate system (the pointers start and end with respect to the
| memory coordinate system). "tensor calculus" is a bunch of rules
| that tell you how the numbers change when one changes coordinate
| systems (ie, how the pointers start and end change when the
| pointer memory changes). Some quantities that are computed from
| tensors are "physical", like the length of the queue, as they are
| invariant under transformations. Tensor calculus gives a
| principled way to make sure that the final answers we calculate
| are "invariant" / "physical" / "real". The actual locations of
| start and end don't matter, as (end - start) will always be the
| length of the list!
|
| ---
|
| Physicists (and people who write memory allocators) need such
| elaborate tracking, to keep track of what is "real" and what is
| "coordinate dependent", since a lot of physics involves crazy
| coordinate systems, and having ways to know what things are real
| and what are artefacts of one's coordinate system is invaluable.
| For a real example, consider the case of singularities of the
| Schwarzschild solution to GR, where we initially thought there
| were two singularities, but it later turned out there was only
| one "real" singularity, and the other singularity was due to a
| poor choice of coordinate system:
|
| Although there was general consensus that the singularity at r =
| 0 was a 'genuine' physical singularity, the nature of the
| singularity at r = rs remained unclear. In 1921 Paul Painleve and
| in 1922 Allvar Gullstrand independently produced a metric, a
| spherically symmetric solution of Einstein's equations, which we
| now know is a coordinate transformation of the Schwarzschild
| metric, Gullstrand-Painleve coordinates, in which there was no
| singularity at r = rs. They, however, did not recognize that
| their solutions were just coordinate transforms.
| ijidak wrote:
| Here is a video series on tensors I've enjoyed:
| https://youtube.com/playlist?list=PLJHszsWbB6hrkmmq57lX8BV-o...
|
| And this series by Dialect:
| https://youtube.com/playlist?list=PL__fY7tXwodmfntSAAyBDxZ4_...
| mvaliente2001 wrote:
| The idea of tensors as "a matrix of numbers" or the example of a
| cube with vectors on every face never clicked for me. It was this
| [NASA paper](https://www.grc.nasa.gov/www/k-12/Numbers/Math/docum
| ents/Ten...) that finally brought me clarity. The main idea, as
| others already commented, is that a tensor or rank n is a
| function that can be applied up to n vector, reducing its rank by
| one for each vector it consumes.
| cryptonector wrote:
| > a tensor or rank n is a function that can be applied up to n
| vector
|
| There seems to be a grammar problem here.
| qsdf38100 wrote:
| In your cube example you are using the word "vector" to refer
| to faces of the cube. Did you mean matrix?
|
| My understanding is that the cube is a rank 3 tensor, the faces
| (or rather slices) of the cube are rank 2 tensors (aka
| matrices), and the edges (slices) of the matrices are rank 1
| tensors (aka vectors).
___________________________________________________________________
(page generated 2024-08-12 23:01 UTC)