[HN Gopher] Tensors, the geometric tool that solved Einstein's r...
       ___________________________________________________________________
        
       Tensors, the geometric tool that solved Einstein's relativity
       problem
        
       Author : Luc
       Score  : 70 points
       Date   : 2024-08-12 15:11 UTC (7 hours ago)
        
 (HTM) web link (www.quantamagazine.org)
 (TXT) w3m dump (www.quantamagazine.org)
        
       | senderista wrote:
       | If you have any linear algebra background, then the definition of
       | a tensor is straightforward: given a vector space _V_ over a
       | field _K_ (in physics, _K_ = _R_ or _C_), a tensor _T_ is a
       | multilinear (i.e. linear in each argument) function from vectors
       | and dual vectors in _V_ to numbers in _K_. That's it! A type
       | _(p, q)_ tensor _T_ takes _p_ vectors and _q_ dual vectors as
       | arguments (_p+q_ is often called the _rank_ of _T_ but is
       | ambiguous compared to the type).
       | 
       | (If you're unfamiliar with the definition of dual vector, it's
       | even simpler: it's just a linear function from _V_ to _K_.)
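       | 
       | As a throwaway numpy sketch (the component matrix g and the test
       | vectors are arbitrary choices, purely for illustration), a type
       | _(2, 0)_ tensor is nothing more than a function of two vectors
       | that is linear in each slot:
       | 
       |     import numpy as np
       |     
       |     g = np.diag([1.0, 2.0, 3.0])  # components g_ij in some basis
       |     
       |     def T(v, w):
       |         # type (2, 0): two vectors in, one number in K = R out
       |         return float(v @ g @ w)
       |     
       |     v = np.array([1.0, 0.0, 2.0])
       |     w = np.array([0.0, 1.0, 1.0])
       |     u = np.array([3.0, 1.0, 0.0])
       |     # multilinear = linear in each argument separately
       |     assert np.isclose(T(2*v + u, w), 2*T(v, w) + T(u, w))
       |     assert np.isclose(T(v, 2*w + u), 2*T(v, w) + T(v, u))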
        
         | will-burner wrote:
          | The definition may be simple, but it's not very concrete, and
          | I'd argue that makes it not straightforward. While examples of
          | vector spaces can be very concrete (think R, R^2, R^30), I
          | struggle to think of a concrete example of a multilinear
          | function from vectors and dual vectors in V to numbers in K. On
          | top of that, when working with tensors you don't usually use
          | the definition as a multilinear function, at least as far as I
          | remember.
        
           | tel wrote:
           | Not really to push back as I do agree that this is a bit
           | trickier to get an intuition for than the OP suggests, but
           | the most trivial concrete example of a (1, 1) tensor would
           | just be the evaluation function (v, f) |-> f(v), which, given
           | a metric, corresponds to the inner product.
        
           | cshimmin wrote:
           | A simple example of a multilinear function is the inner
           | (a.k.a dot) product <a, b>: it takes a vector (b), and a dual
           | vector (a^T), and returns a number. In tensor notation it's
           | typically written d_ij.
           | 
           | It's multilinear because it's linear in each of its arguments
           | separately: <ca, b> = c<a,b> and <a, cb> = c<a,b>.
           | 
           | Another simple but less obvious example is a rotation
           | (orthogonal) matrix. It takes a vector as an input, and
           | returns a vector. But a vector itself can be thought of as a
           | linear function that takes a dual vector and returns a number
           | (via the inner product, above!). So, applying the rotation
           | matrix to a vector is a sort of "currying" on the multilinear
           | map, while the matrix alone can be considered a function that
           | takes a vector and a dual vector, and returns a number.
           | 
           | In functional notation, you can consider your rotation matrix
           | to be a function (V x V*) -> K, which can in turn be
           | considered a function V -> (V* -> K), where V* is the dual
           | space of V.
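            | 
            | A quick numpy sketch of that currying (the rotation matrix
            | and the particular vectors are arbitrary, just to show the
            | shapes):
            | 
            |     import numpy as np
            |     
            |     R = np.array([[0.0, -1.0],
            |                   [1.0,  0.0]])   # a 90-degree rotation
            |     
            |     def bilinear(a_dual, b):
            |         # V* x V -> K: a dual vector and a vector in
            |         return float(a_dual @ R @ b)
            |     
            |     def curried(b):
            |         # V -> (V* -> K): partial application; "is" R @ b
            |         return lambda a_dual: float(a_dual @ (R @ b))
            |     
            |     a_dual = np.array([1.0, 2.0])   # a dual vector
            |     b = np.array([3.0, 4.0])        # a vector
            |     assert np.isclose(bilinear(a_dual, b),
            |                       curried(b)(a_dual))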
        
             | senderista wrote:
             | I think you're describing the evaluation map T(v, w) =
             | w(v), which has type (1,1), rather than the inner product,
             | which has type (2,0). The inner product lets you "raise and
             | lower indices" (i.e. convert between vectors and dual
             | vectors), so you can basically pretend that it is the
             | evaluation map.
        
           | klodolph wrote:
           | I think part of this is "if you have a linear algebra
           | background". There are a few different explanations of
           | tensors, and different explanations make sense for different
           | people.
        
           | adrian_b wrote:
           | In physics, the first and even now the most important
           | application of multilinear functions, a.k.a. tensors, is in
           | the properties of anisotropic solids.
           | 
           | A solid can be anisotropic, i.e. with properties that depend
           | on the direction, either because it is crystalline or because
           | there are certain external influences, like a force or an
           | electric field or a magnetic field that are applied in a
           | certain direction.
           | 
           | In (linear) anisotropic solids, a vector property that
           | depends on another vector property is no longer collinear
           | with the source, but it has another direction, so the output
           | vector is a bilinear function of the input vector and of the
           | crystal orientation, i.e. it is obtained by the
           | multiplication with a matrix. This happens for various
           | mechanical, optical, electric or magnetic properties.
           | 
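            | A toy numpy illustration of that non-collinear response (the
            | susceptibility numbers are invented, not from any real
            | crystal):
            | 
            |     import numpy as np
            |     
            |     # a made-up anisotropic susceptibility: a second-order
            |     # tensor, i.e. a linear map from field to polarization
            |     chi = np.array([[4.0, 0.5, 0.0],
            |                     [0.5, 2.0, 0.0],
            |                     [0.0, 0.0, 1.5]])
            |     
            |     E = np.array([1.0, 0.0, 0.0])   # field applied along x
            |     P = chi @ E                     # resulting polarization
            |     
            |     # in an isotropic medium P would be parallel to E
            |     cos = P @ E / (np.linalg.norm(P) * np.linalg.norm(E))
            |     print(P, cos)   # P picks up a y component, cos < 1
            | 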
           | When there are more complex effects, which connect properties
           | from different domains, like piezoelectricity, which connects
           | electric properties with mechanical properties, then the
           | matrices that describe vector transformations, a.k.a. tensors
           | of the second order, may depend on other such tensors of the
           | second order, so the corresponding dependence is described by
           | a tensor of the fourth order.
           | 
           | So the tensors really appear in physics as multilinear
           | functions, which compute the answers to questions like "if I
           | apply a voltage on the electrodes deposited on a crystal in
            | these positions, what will be the direction and magnitude of
           | the displacements of certain parts of the crystal". While in
           | isotropic media you can have relationships between vectors
           | that are described by scalars and relationships between
           | scalars that are also described by scalars, the corresponding
           | relationships for anisotropic media become much more
           | complicated and the simple scalars are replaced everywhere by
           | tensors of various orders.
           | 
           | What in an isotropic medium is a simple proportionality
           | becomes a multilinear function in an anisotropic medium.
           | 
           | The distinction between vectors and dual vectors appears only
           | when the coordinate system does not use orthogonal axes,
           | which makes all computations much more complicated.
           | 
           | The anisotropic solids have become extremely important in
           | modern technology. All the high-performance semiconductor
           | devices are made with anisotropic semiconductor crystals.
        
         | AnotherGoodName wrote:
         | For those without a strong math background but more of a
         | programmers background;
         | 
         | You know the matrices you work with in 2D or 3D graphics
         | environments that you can apply to vectors or even other
         | matrices to more easily transform (rotate, translate, scale)?
         | 
          | Well, tensors are the generalisation of this concept. If you've
          | noticed that 2D games' transformation matrices seem similar
          | (although much simpler) to 3D games' transformation matrices,
          | you've probably wondered what it'd look like for a 4D spacetime
          | or even more complex scenarios. Well, you've now started
          | thinking about tensors.
        
           | andrewla wrote:
           | Slight disagree here -- matrices are enough for
           | transformations in 2, 3, 4, and 100 dimensions. Tensors are
           | not arrays with more rows and columns; they are higher
           | dimensional objects -- more indices, not greater range of
           | indices.
        
             | SkyBelow wrote:
             | Is a tensor higher dimension, or is it the generalized form
             | of the structure encompassing all of it (individual numbers
             | (0 dimensions), vectors (1 dimension), matrixes (2
             | dimensions), and so on)? Kind of like how an n-sphere
             | describes circles and spheres for n equal to 2 or 3.
        
               | andrewla wrote:
               | The problem is that I'm using dimensional in two
               | different senses. A vector with N elements, for example,
               | is a one-dimensional array of numbers operating on N
               | dimensional spaces, and so on for matrices and tensors.
               | 
               | If you're doing graphics programming you're operating in
               | three and four dimensional spaces mostly (four
               | dimensional being projective spaces), because that's the
               | space you're trying to render in two dimensions. But
               | you'll rarely need anything higher than a matrix (a two-
               | dimensional data structure) for operations on that space.
               | 
               | If you're doing physics you're operating in three, four,
               | and infinite dimensional spaces, mostly. And you'll
                | routinely use higher data structures -- even things like
                | the piezoelectric effect can't really be described
                | without rank 3 tensors (a three-dimensional data
                | structure).
               | 
               | In statistics and machine learning, you're operating in
               | very high dimensional spaces, and will find yourself
               | using non-square tensors especially (in the other areas
               | everything will be square). The data structures will
               | generally be high dimensional as well, but usually just a
               | function of model complexity; so maybe 4 or 5 dimensional
               | data structures.
        
               | enugu wrote:
               | Any given tensor has a type (p,q). Say d=p+q. Then d is
               | the 'dimension' of the tensor in the sense that you would
               | need nxnx...n=n^d numbers (think of an nxnx...n array) to
               | describe the tensor where n is the dimension of the
               | underlying vector space.
               | 
               | > or is it the generalized form of the structure
               | encompassing all of it...Kind of like how an n-sphere
               | 
               | If you are asking whether there are examples of tensors
               | parametrized by an integer d, you can cook up examples -
                | like the (d,0) tensor whose input is a sequence of d
                | vectors and just adds up all the components with respect
                | to some basis in each slot and then multiplies these
                | numbers across all the d slots.
               | 
                | But just like an n-sphere is a special example of a zero
                | set of a polynomial in n variables, the above tensor is a
                | specific example - it is the tensor where all the
                | components are 1 in the higher dimensional array.
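                | 
                | A throwaway numpy version of that (d,0) example for d = 2
                | (the test vectors are arbitrary):
                | 
                |     import numpy as np
                |     
                |     n, d = 3, 2
                |     T = np.ones((n,) * d)   # all components equal to 1
                |     
                |     def apply(T, *vecs):
                |         # contract one slot per input vector
                |         for v in vecs:
                |             T = np.tensordot(T, v, axes=([0], [0]))
                |         return float(T)
                |     
                |     v = np.array([1.0, 2.0, 3.0])
                |     w = np.array([1.0, 1.0, 0.0])
                |     # with all-ones components: sum(v) * sum(w)
                |     assert np.isclose(apply(T, v, w), v.sum() * w.sum())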
               | 
               | Sometimes, one considers the algebra of tensors across
               | all dimensions like the symmetric algebra or exterior
               | algebra simultaneously (where there is multiplication
               | operation between tensors of different dimensions), but
               | that might not be what you were asking about.
        
               | thechao wrote:
               | > (individual numbers (0 dimensions), vectors (1
               | dimension), matrixes (2 dimensions), and so on)
               | 
               | You're using the word "dimension" in two (distinct) ways.
               | Instead, use the word "rank":
               | 
               | > (individual numbers (0 rank), vectors (1 rank),
               | matrixes (2 rank), and so on)
               | 
               | Now, we can talk about a 4-dimensional rank-1 tensor,
               | e.g., a 4-element vector.
               | 
               | Now, think about a 4x4 matrix: if we multiply the matrix
               | by a 4-vector, we get a 4-vector out: in some ways, the
               | multiplication has "eaten" one of the _ranks_ of the
               | matrix; but, the dimension of the resulting object is the
               | same. If we had a 3x2 matrix, and we multiplied it by a
               | 3-vector, then both the rank has changed (from 2 to 1)
               | _and_ the dimension has changed (from 3 to 2).
               | 
                | A tensor can have any rank.
               | 
               | More importantly, the ranks of a tensor come in two
               | "flavors": a vector and a one-form. The concepts are
               | pretty darn general, but one way to get a feel for how
               | they're related is that the transpose of a vector can be
               | its dual. This gets into things like pre- and post-
               | multiplication; or, whether we 'covary' or 'contravary'
               | with respect to the tensor.
               | 
               | Frankly, tensor products are a beast to deal with,
               | mechanically, so the literature mostly deals with them as
                | opaque objects. Modern tensor software libraries and
                | high-performance computing have seen a sea change in the
                | use of GR.
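                | 
                | A quick numpy sketch of the rank/dimension split (ndim is
                | numpy's name for what I'm calling rank here):
                | 
                |     import numpy as np
                |     
                |     v = np.zeros(4)       # rank 1, dimension 4
                |     M = np.zeros((4, 4))  # rank 2, dimension 4
                |     w = M @ v             # multiplication eats one rank
                |     
                |     N = np.zeros((3, 2))  # rank 2, dimensions 3 and 2
                |     u = np.zeros(3) @ N   # rank 2 -> 1, dimension 3 -> 2
                |     
                |     print(v.ndim, M.ndim, w.shape, u.shape)
                |     # 1 2 (4,) (2,)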
        
           | mitthrowaway2 wrote:
           | Is there a difference between a 4x4 matrix and a 4x4 tensor?
        
             | AnotherGoodName wrote:
              | Matrices are a special case of tensors, and the 4x4 matrix
              | is absolutely a tensor. Tensors can be way more complex and
              | do more, but the 4x4 matrix you use in 3D operations is a
              | great starting point.
        
             | tel wrote:
             | In short, tensors generalize matrices. While we can
             | probably guess accurately what "4x4 matrix" means, "4x4
             | tensor" is missing some information to really nail down
             | what it means.
             | 
             | Interestingly, that extra information helps us to
             | differentiate between the same matrix being used in
             | different "roles". For instance, if you have a 4x4 matrix
             | A, you might think of it like a linear transformation.
             | Given x: V = R^4 and y = Ax, then y is another vector in V.
             | Alternatively, you might think of it like a quadratic form.
             | Given two vectors x, y: V, the value xAy is a real number.
             | 
             | In linear algebra, we like to represent both of those
             | operations as a matrix. On the other hand, those are
             | _different_ tensors. The first would be a rank-(1,1)
             | tensor, the second a rank-(2, 0) tensor.
             | 
             | Ultimately, we might write down both of those tensors with
             | the same 4x4 array of 16 numbers that we use to represent
             | 4x4 matrices, but in the sort of math where all these
             | subtle differences start to really matter there are
             | additional rules constraining how rank-(1, 1) tensors are
             | distinct from rank-(2, 0) tensors.
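              | 
              | A small numpy sketch of that distinction (the basis-change
              | matrix P is arbitrary; I'm using the vectors/dual-vectors
              | convention from upthread):
              | 
              |     import numpy as np
              |     rng = np.random.default_rng(0)
              |     
              |     A = rng.random((4, 4))   # one 4x4 array of numbers
              |     P = rng.random((4, 4))   # an invertible basis change
              |     Pinv = np.linalg.inv(P)
              |     x, y = rng.random(4), rng.random(4)
              |     
              |     # as a linear map (rank-(1,1)) the components change as
              |     A_map = Pinv @ A @ P
              |     # as a bilinear form (rank-(2,0) here) they change as
              |     A_form = P.T @ A @ P
              |     
              |     # both rules preserve what the tensor computes
              |     assert np.allclose(A_map @ (Pinv @ x), Pinv @ (A @ x))
              |     assert np.isclose((Pinv @ x) @ A_form @ (Pinv @ y),
              |                       x @ A @ y)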
        
           | pyinstallwoes wrote:
           | Is there a similar "gimbal lock" problem that people ran
           | into?
        
           | jesuslop wrote:
            | To add a bit, kudos to the root-parent boil-down. Programmers
            | already have the right representation, and call it an
            | n-dimensional array: a list of lists of lists ... (repeat n
            | times) ... of lists of numbers. The only nuisance is that
            | what programmers call the dimension, math people call the
            | rank. It is the sizes of those nested lists that math people
            | call dimensions. It's all set up for a comedy of errors. Also,
            | in math the rank is split to make explicit how many of the
            | rank-many arguments are vectors and how many are dual
            | vectors. You'd say something like: this is a rank 7 tensor, 3
            | times covariant (3 vector arguments) and 4 times
            | contravariant (4 dual vector arguments), summing to 7 total
            | arguments. I'm assuming a fixed basis, so the root-parent map
            | determines an array of numbers.
        
           | paulpauper wrote:
            | This is still not it. 4D would simply be a 4x4 matrix instead
            | of 3x3.
            | 
            | Tensors are something which no one has been able to fully or
            | adequately describe. I think you simply have to treat them as
            | a set of operations and not try to map or force them onto
            | existing concepts like linear algebra or matrices. They are
            | similar but otherwise something completely different.
        
         | g15jv2dp wrote:
         | That's incorrect if V is infinite-dimensional. A (0,1)-tensor
         | is just supposed to be an element of V but with your definition
         | you get an element of the bidual of V. Which is not isomorphic
         | to V when dim V is infinite. And even when dim V is finite, you
         | need to choose a basis of V to find an isomorphism with the
         | bidual. From a math point of view, that's just no good.
        
           | senderista wrote:
           | No, the isomorphism between V and V** (for finite-dimensional
           | V) is canonical. The canonical isomorphism T:V->V** is easy
           | to construct: map a vector v in V to the element of V** which
           | takes an element w from V* and applies it to v: T(v)(w) =
           | w(v).
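            | 
            | In code (a disposable Python sketch, with dual vectors
            | modelled as plain functions), the map needs no basis at all:
            | 
            |     v = (1.0, 2.0, 3.0)              # some v in V = R^3
            |     w = lambda u: 4*u[0] - u[2]      # some dual vector in V*
            |     
            |     # the canonical map T : V -> V**
            |     T = lambda vec: (lambda dual: dual(vec))
            |     
            |     assert T(v)(w) == w(v)           # T(v)(w) = w(v)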
        
             | g15jv2dp wrote:
             | GP is giving you an element of V**. You want to turn it
             | into a vector. To do that, please make the inverse
             | isomorphism explicit without using a basis. I'll wait...
        
               | xanderlewis wrote:
               | You said even when dim V is finite you need a basis to
               | find an isomorphism with V**. But that's not true. You're
               | right if you mean V*, but not the bidual.
        
               | senderista wrote:
               | Why? All I need to prove is that I have a canonical
               | linear map going in one direction, and that this map is a
               | bijection. (Since I already constructed such a bijection,
               | I could just answer "for any z in V**, take the unique
               | element v in V such that for all w in V*, w(v) = z(w)".)
               | Do you disagree that I provided such a map? You are
               | correct that V** is not isomorphic to V when V is
               | infinite-dimensional, but your statement that "when dim V
               | is finite, you need to choose a basis of V to find an
               | isomorphism with the bidual" is incorrect. This is
               | elementary textbook stuff (e.g. first chapter of Wald's
               | GR book), so I won't argue further.
        
           | btilly wrote:
           | You are right about infinite dimensions, wrong about finite
           | dimensions. V and V* are naturally isomorphic for finite
           | dimensions.
           | 
           | In finite dimensions, V and V* are isomorphic, but not
           | naturally so. The isomorphism requires additional
           | information. You can specify a basis to get the isomorphism,
           | but many bases will give the same isomorphism. The exact
           | amount of information that you need is a metric. If you have
           | a metric, then every orthonormal basis in that metric will
           | give the same isomorphism.
        
             | ogogmad wrote:
             | You need to correct the 2nd sentence to say that V and V**
             | are naturally isomorphic. V and V* are only _unnaturally_
             | isomorphic. All of this holds only in finite dimensions, of
             | course.
        
             | g15jv2dp wrote:
             | You have a typo in your first line, and I answered a
             | sibling comment about that. Metrics are irrelevant to the
             | discussion (and I presume you wanted to write "norm"
             | instead of "metric").
        
               | btilly wrote:
               | Yes, I had a typo.
               | 
               | As for why I said metric, see
               | https://en.wikipedia.org/wiki/Metric_tensor. Which is
               | technically a concept from differential geometry rather
               | than linear algebra. But then again, tensors are
               | literally the topic that started this. And it is only in
               | differential geometry that I've ever cared about mapping
               | from V to V*.
        
               | g15jv2dp wrote:
               | There are no manifolds at all in this discussion... What
               | are you even talking about? Just stringing words you
               | vaguely know together??
        
               | btilly wrote:
               | This comment is in a discussion about an article titled,
                | _Tensors, the geometric tool that solved Einstein's
               | relativity problem_. Therefore, "tensors are literally
               | the topic that started this discussion."
               | 
               | Hopefully that's a hint that you should attempt to figure
               | out what someone might be talking about before going to
               | schoolyard insults.
        
         | mr_mitm wrote:
         | Yes, very simple, except that when physicists say "tensor",
         | they mean tensor fields, on smooth, curved manifolds, in at
         | least four dimensions, often with a Lorentz metric. Things stop
         | being simple quickly.
        
           | adrian_b wrote:
           | It does not matter on what set a tensor field is defined.
           | 
           | A tensor field is not a tensor, but the value of a tensor
           | field at any point is a tensor, which satisfies the
           | definition given above, exactly like the value of a vector
           | field at any point is a vector.
           | 
           | The "fields" are just functions.
           | 
           | There are physics books that do not give the easier to
           | understand definition given above, but they give an
           | equivalent, but more obscure, definition of a tensor, by
           | giving the transformation rules for its contravariant
           | components and for its covariant components at a change of
           | the reference system.
           | 
           | The word "tensor" with the current meaning has been used for
           | the first time by Einstein and he has not given any
           | explanation for this word choice. The theory of tensors that
           | Einstein has learned had not used the word "tensor".
           | 
            | Before Einstein, the word "tensor" (coined by Hamilton) was
            | used in physics with the meaning of "symmetric matrix",
            | because the geometric (affine) transformation of a body that
            | is determined by multiplication with a symmetric matrix
            | stretches (or compresses) the body along certain directions
            | (the axes of the rotation that would diagonalize the
            | symmetric matrix). The word "tensor" in the old sense was
            | applied only to what is now called a "symmetric tensor of the
            | second order" (which remains the most important kind of
            | tensor that is neither a vector nor a scalar).
        
             | prof-dr-ir wrote:
             | > The "fields" are just functions.
             | 
             | I think this is far too simplistic, for one because the
             | values of this putative function depend on the chosen
             | coordinate system.
             | 
             | So I completely agree with the comment you are replying to:
             | when a physicist says "tensor" they really mean a "tensor
             | field" and the definition of the latter is quite a bit more
             | involved than just specifying a multilinear map at each
             | point of a manifold.
        
               | mr_mitm wrote:
               | Plus, as if tensor fields on Lorentz manifolds weren't
               | already complicated enough, physicists aren't happy until
               | they can write down some differential equations. So not
               | only are you doing calculus, you're doing it on curved
               | manifolds, with complicated tensor objects, in the
               | context of partial differential equations, which - in the
               | case of general relativity - are non-linear. It's okay to
               | admit that all of this is a bit hard. Hell, as the
               | article points out, Einstein himself had trouble
               | understanding them.
        
               | adrian_b wrote:
               | The values of the "putative function" do not depend on
               | the chosen coordinate system.
               | 
               | This is the essence of notions like scalar, vector,
               | tensor, that they do not depend on the chosen coordinate
               | system.
               | 
               | Only their numeric representations associated with a
               | chosen coordinate system do depend on that system.
               | 
               | If you compute some arbitrary functions of the numeric
               | components of a tensor in a certain coordinate system, in
               | most cases the array of numbers that composes the result
               | will not be a tensor, precisely because the result will
               | really be different in any other coordinate system, while
               | a tensor must be invariant.
               | 
               | All physical laws are formulated only using various kinds
               | of tensors, including vectors and scalars, precisely
                | because they must be invariant under the choice of the
               | coordinate system.
        
             | abdullahkhalids wrote:
             | > There are physics books that do not give the easier to
             | understand definition given above, but they give an
             | equivalent, but more obscure, definition of a tensor, by
             | giving the transformation rules for its contravariant
             | components and for its covariant components at a change of
             | the reference system.
             | 
             | The definition of a tensor as linear maps, while simple to
             | understand, has no content that is useful for doing
             | physics. To do any physics, or for that matter, any
             | geometry with tensors, you need to define the notion of
             | covariance and contravariance.
             | 
              | Besides, starting with the latter notions allows you to
              | define tensors more naturally. You start by trying to
             | understand how geometric objects transform under coordinate
             | transformations, and you slowly but surely end up with
             | tensors.
        
               | tacomonstrous wrote:
               | Covariance and contravariance are mathematical notions,
               | and have to do with whether each multiplicative
                | constituent of the tensor is a vector in your given
               | vector space (covariant) or a linear functional on this
               | space (contravariant). There is no inherent physical
               | meaning to either concept.
        
               | adrian_b wrote:
                | No, the definition of a tensor as a linear map is the
                | only definition that is useful for doing physics.
               | 
               | All the physical quantities that are defined to be
               | tensors are quantities used to transform either vectors
               | into other vectors or tensors of higher orders into other
               | tensors of higher orders (for instance the transformation
               | between the electric field vector and the electric
               | polarization vector).
               | 
               | Therefore all such physical quantities are used to
               | describe multilinear functions, either in linear
               | anisotropic media, or in non-linear anisotropic media,
               | but in the latter case they are applicable only to
               | relations between small differences, where linear
               | approximations may be used.
               | 
               | The multilinear function is the physical concept that is
               | independent of the coordinate system. The concrete
               | computations with a tensor a.k.a. multilinear function
               | may need the computation of contravariant and/or
               | covariant components in a particular coordinate system
               | and the use of their transformation rules. On the other
               | hand, the abstract formulation of the physical laws does
               | not need such details, but only the high-level
               | definitions using multi-linear functions, and it is
               | independent of any choice for the coordinate system.
               | 
               | There is a unique multilinear function a.k.a. tensor, but
               | it can be expressed by an infinity of different arrays of
               | numbers, corresponding to various combinations of
               | contravariant or covariant components, in various
               | coordinate systems. Their transformation rules can be
               | determined by the condition that they must represent the
               | same function. In the books that do not explain this, the
               | rules appear to be magic and they do not allow an
               | understanding of why the rules are these and not others.
        
             | cfgauss2718 wrote:
             | Very well, let's just agree that in physics (r,s) tensors
             | usually refer to sections of the tensor product of some
             | fixed number of copies of the tangent bundle (r copies) and
             | cotangent bundle (s copies) of a smooth manifold (almost
             | always pseudo-Riemannian) and leave it there. Elementary!
        
         | DemocracyFTW2 wrote:
         | A monad is just a monoid in the category of endofunctors,
         | what's the problem?
        
         | bratwurst3000 wrote:
         | you could also call it a matrix to modify a matrix in a vector
         | space. not 100% correct. xD
        
         | bjourne wrote:
         | Well, I can write a definition that is both easier to
         | understand and shorter than yours:
         | 
         | A tensor is a multi-dimensional array.
         | 
         | :)
        
           | eigenket wrote:
           | This only works in finite dimensions, which for
           | mathematicians excludes pretty much all of the interesting
           | cases.
        
           | ithinkso wrote:
            | This is actually a harmful definition: both (1,1) and (0,2)
            | tensors can be written as a matrix, but they are _very_
            | different. It's like calling a vector an array, but vectors
            | require a vector space while arrays are just arrays. It
            | doesn't help that std::vector is very common in CS, but
            | 'pushing back' to a mathematical vector just doesn't make any
            | sense.
        
         | ndriscoll wrote:
         | I think the people who find this definition to be mysterious
         | are really looking for (borrowing from Ravi Vakil[0]) "why is a
         | tensor" rather than "what is a tensor". In that case, a better
         | answer IMO is that it's the "most generic" way to multiply
         | vectors that's compatible with the linear structure: "v times
         | w" is _defined to be the symbol_ "v[?]w". There is no meaning
         | to that symbol.
         | 
         | But these things are vectors, so you could write e.g. v =
          | a·x+b·y, and then you want e.g. (a·x+b·y)⊗w = a·x⊗w
          | + b·y⊗w, and so on.
         | 
         | So in some sense, the quotient space construction[1] gives a
         | better "why". It says
         | 
         | * I want to multiply vectors in V and W. So let's just start by
          | writing down that "v times w" is the symbol "v⊗w", and I want
         | to have a vector space, so take the vector space generated by
         | all of these symbols.
         | 
          | * But I also want that (v_1+v_2)⊗w = v_1⊗w + v_2⊗w
         | 
          | * And I also want that v⊗(w_1+w_2) = v⊗w_1 + v⊗w_2
         | 
          | * And I also want that (sv)⊗w = s(v⊗w) = v⊗(sw)
         | 
         | And that's it. However you want to concretely define tensors,
         | they ought to be "a way to multiply vectors that follows those
         | rules". Quotienting is a generic technique to say "start with
         | this object, and add this additional rule while keeping all of
         | the others".
         | 
         | Another way to say this is that the tensor algebra is the "free
         | associative algebra": it's a way to multiply vectors where the
         | only rules you have to reduce expressions are the ones you
         | needed to have.
         | 
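          | To make those rules concrete, here's a throwaway numpy check
          | that the familiar outer-product model of ⊗ satisfies them
          | (this is one concrete model, not the definition):
          | 
          |     import numpy as np
          |     
          |     def otimes(v, w):
          |         # model v⊗w as the outer-product array
          |         return np.outer(v, w)
          |     
          |     v1, v2 = np.array([1.0, 2.0]), np.array([0.0, 3.0])
          |     w1, w2 = np.array([4.0, 5.0]), np.array([1.0, 1.0])
          |     s = 2.5
          |     
          |     assert np.allclose(otimes(v1 + v2, w1),
          |                        otimes(v1, w1) + otimes(v2, w1))
          |     assert np.allclose(otimes(v1, w1 + w2),
          |                        otimes(v1, w1) + otimes(v1, w2))
          |     assert np.allclose(otimes(s*v1, w1), s*otimes(v1, w1))
          |     assert np.allclose(otimes(s*v1, w1), otimes(v1, s*w1))
          | 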
         | [0] https://www.youtube.com/live/mqt1f8owKrU?t=500
         | 
         | [1]
         | https://en.wikipedia.org/wiki/Tensor_product#As_a_quotient_s...
        
       | max_likelihood wrote:
       | I've always thought the use of "Tensor" in the "TensorFlow"
       | library is a misnomer. I'm not too familiar with ML/theory, is
       | there a deeper geometric meaning to the multi-dimensional array
       | of numbers we are multiplying or is "MatrixFlow" a more
       | appropriate name?
        
         | itishappy wrote:
         | The tensors in tensorflow are often higher dimensional. Is a 3d
         | block of numbers (say 1920x1080x3) still a matrix? I would
         | argue it's not. Are there transformation rules for matrices?
         | 
         | You're totally correct that the tensors in tensorflow do drop
          | the geometric meaning, but there's precedent there from how CS
         | vs math folk use vectors.
        
         | andrewla wrote:
         | Matrices are strictly two-dimensional arrays (together with
         | some other properties, but for a computer scientist that's it).
         | Tensors are the generalization to higher dimensional arrays.
        
         | MathMonkeyMan wrote:
         | The joke I learned in a Physics course is "a vector is
         | something that transforms like a vector," and "a tensor is
         | something that transforms like a tensor." It's true, though.
         | 
         | The physicist's tensor is a matrix of functions of coordinates
         | that transform in a prescribed way when the coordinates are
         | transformed. It's a particular application of the chain rule
         | from calculus.
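          | 
          | For concreteness, a tiny numpy sketch of that chain-rule
          | transformation for a vector's components under a Cartesian-to-
          | polar change of coordinates (the sample point and components
          | are arbitrary):
          | 
          |     import numpy as np
          |     
          |     def jac_polar(x, y):
          |         # rows: (dr/dx, dr/dy) and (dth/dx, dth/dy)
          |         r = np.hypot(x, y)
          |         return np.array([[ x / r,     y / r   ],
          |                          [-y / r**2,  x / r**2]])
          |     
          |     x, y = 3.0, 4.0
          |     v_xy = np.array([1.0, 2.0])      # components in (x, y)
          |     v_rth = jac_polar(x, y) @ v_xy   # components in (r, th)
          |     print(v_rth)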
         | 
         | I don't know why the word "tensor" is used in other contexts.
         | Google says that the etymology of the word is:
         | 
         | > early 18th century: modern Latin, from Latin tendere 'to
         | stretch'.
         | 
         | So maybe the different senses of the word share the analogy of
         | scaling matrices.
        
           | ogogmad wrote:
           | The mathematical definition is 99% equivalent to the physical
           | one. I find that the physical one helps to motivate the
           | mathematical one by illustrating the numerical difference
           | between the basis-change transformation for (1,0)- and
           | (0,1)-tensors. The mathematical one is then simpler and more
           | conceptual once you've understood that motivation. The
           | concept of a tensor really belongs to linear algebra, but
           | occurs mostly in differential geometry.
           | 
           | There is still a "1% difference" in meaning though. This
           | difference allows a physicist to say "the Christoffel symbols
            | are _not_ a tensor", while a mathematician would say this is
           | a conflation of terms.
           | 
           | TensorFlow's terminology is based on the rule of thumb that a
           | "vector" is really a 1D array (think column vector), a
           | "matrix" is really a 2D array, and a "tensor" is then an nD
           | array. That's it. This is offensive to physicists especially,
            | but ¯\_(ツ)_/¯
        
             | bjourne wrote:
              | The problem with the physicist's definition is that the
              | larger the N, the less the geometrical interpretation makes
              | sense. For 1-, 2-, and even 3-dimensional tensors there is
              | some connection to geometry, but eventually it loses all
              | meaning. The physicist has to give up and "admit" that an
              | N-dimensional tensor really is just a collection of
              | (N-1)-dimensional tensors.
        
         | blt wrote:
         | There is no geometric meaning. It's a really bad name.
        
         | dannymi wrote:
         | In the first example on
         | https://www.tensorflow.org/api_docs/python/tf/math/multiply you
         | can see that they use the Hadamard product (not the matrix
          | product):
          | 
          |     x = tf.constant(([1, 2, 3, 4]))
          |     tf.math.multiply(x, x)
          |     <tf.Tensor: shape=(4,), dtype=...,
          |      numpy=array([ 1,  4,  9, 16], dtype=int32)>
         | 
         | I could stop right here since it's a counterexample to x being
         | a matrix (with a matrix product defined on it; P.S. try
         | tf.matmul(x, x)--it will fail; there's no .transpose either).
         | But that's only technically correct :)
         | 
         | So let's look at tensorflow some more:
         | 
         | The tensorflow tensors should transform like vectors would
         | under change of coordinate system.
         | 
         | In order to see that, let's do a change of coordinate system.
          | To summarize the stuff below: if L1 and W12 are indeed tensors,
          | it should be true that (W12 A^-1)(A L1) = W12 L1.
         | 
         | Try it (in tensorflow) and see whether the new tensor obeys the
         | tensor laws after the transformation. Interpret the changes to
         | the nodes as covariant and the changes to the weights as
          | contravariant:
          | 
          |     import tensorflow as tf
          |     
          |     # Initial outputs of one layer of nodes in your
          |     # neural network
          |     L1 = tf.constant([2.5, 4, 1.2], dtype=tf.float32)
          |     # Our evil transformation matrix (coordinate system change)
          |     A = tf.constant([[2, 0, 0], [0, 1, 0], [0, 0, 0.2]],
          |                     dtype=tf.float32)
          |     # Weights (no particular values; "random")
          |     W12 = tf.constant(
          |         [[-1, 0.4, 1.5],
          |          [0.8, 0.5, 0.75],
          |          [0.2, -0.3, 1]], dtype=tf.float32
          |     )
          |     # Covariant tensor nature; varying with the nodes
          |     L1_covariant = tf.matmul(A, tf.reshape(L1, [3, 1]))
          |     A_inverse = tf.linalg.inv(A)
          |     # Contravariant tensor nature; varying against the nodes
          |     W12_contravariant = tf.matmul(W12, A_inverse)
          |     # Now derive the inputs for the next layer using the
          |     # transformed node outputs and weights
          |     L2 = tf.matmul(W12_contravariant, L1_covariant)
          |     # Compare to the direct way
          |     L2s = tf.matmul(W12, tf.reshape(L1, [3, 1]))
          |     #assert L2 == L2s
         | 
         | A tensor (like a vector) is actually a very low-level object
         | from the standpoint of linear algebra. It's not hard at all to
         | make something a tensor. Think of it like geometric "assembly
         | language".
         | 
         | In comparison, a matrix is rank 2 (and not all matrices
         | represent tensors). That's it. No rank 3, rank 4, rank 1 (!!).
          | So what does a matrix get you, really?
         | 
         | If you mean that the operations in tensorflow (and numpy before
         | it) aren't beautiful or natural, I agree. It still works,
         | though. If you want to stick to ascii and have no indices on
         | names, you can't do much better (otherwise, use
         | Cadabra[1]--which is great). For example, it was really
         | difficult to write the stuff above without using indices and
         | it's really not beautiful this way :(
         | 
         | More detail on https://medium.com/@quantumsteinke/whats-the-
         | difference-betw...
         | 
         | See also http://singhal.info/ieee2001.pdf for a primer on
         | information science, including its references, for vector
         | spaces with an inner product that are usually used in ML. The
         | latter are definitely geometry.
         | 
         | [1] https://cadabra.science/ (also in mogan or texmacs) -
         | Einstein field equations also work there and are beautiful
        
           | andrewla wrote:
           | In TensorFlow the tf.matmul function or the @ operator
           | perform matrix multiplication. Element-wise multiplication
            | ends up being useful for a lot of parallelizable computation
           | but should not be confused with matrix multiplication.
        
         | twothreeone wrote:
         | I agree. Just like NumPy's Einsum. "Multi-Array Flow" doesn't
         | sound sexy and associating your project with a renowned
         | physicist's name gives your project that "we solve big science
         | problems" vibe by association. Very pretentious, very
         | predictable, and very cringe.
        
         | adrian_b wrote:
         | Since the beginning of computer technology, "array" is the term
         | that has been used for any multi-dimensional array, with
         | "vectors" and "matrices" being special kinds of arrays. An
         | exception was COBOL, which had a completely different
         | terminology in comparison with the other programming languages
         | of that time. Among the long list of differences between COBOL
         | and the rest were e.g. "class" instead of "type" and "table"
         | instead of "array". Some of the COBOL terminology has been
         | inherited by languages like SQL or Simula 67 (hence the use of
         | "class" in OOP languages).
         | 
         | A "tensor", as used in mathematics in physics is not any array,
         | but it is a special kind of array, which is associated with a
         | certain coordinate system and which is transformed by special
         | rules whenever the coordinate system is changed.
         | 
         | The "tensor" in TensorFlow is a fancy name for what should be
         | called just "array". When an array is bidimensional, "matrix"
         | is an appropriate name for it.
        
       | openrisk wrote:
       | > Talk to a computer scientist, and they might tell you that a
       | tensor is an array of numbers that stores important data
       | 
       | The conflicting definitions of tensors have precedent in lower
       | dimensions: _vectors_ were already being used in computer science
       | to mean something different than in mathematics  / physics, long
       | before the current tensormania.
       | 
       | It's not clear if that ambiguity will ever be a practical problem
       | though. For as long as such structures are containers of
       | numerical data with no implied transformation properties we are
       | really talking about two different universes.
       | 
       | Things might get interesting though in the overlap between
       | information technology and geometry [1] :-)
       | 
       | [1] https://en.wikipedia.org/wiki/Information_geometry
        
       | nyrikki wrote:
       | I would argue that today, geometric algebra/Clifford calculus and
       | spacetime algebra are more intuitive and useful.
       | 
       | Gibbs/Heaviside vectors were more popular at the time.
       | 
       | At least for me.
        
         | eigenket wrote:
         | Nah, at least in the context of General Relativity tensors and
         | tensor fields are everything. Geometric algebra doesn't add
         | much.
        
       | bollu wrote:
       | I've written about [this explanation of tensors](https://pixel-
       | druid.com/articles/tensor-is-a-thing-that-tran...) before, and it
       | seems worthwhile to write it down again:
       | 
       | There are two ways of using linear maps in the context of
       | physics. One is as a thing that acts on the space. The other is a
       | thing that acts on the coordinates. So when we talk about
       | transformations in tensor analysis, we're talking about coordinate
       | transformations, not space transformations. Suppose I implement a
       | double ended queue using two pointers:
       | 
       | ```
       | #include <stdlib.h>
       | struct Queue { int *memory, *start, *end; } q;
       | void queue_init(int size) {
       |     q.memory = malloc(sizeof(int) * size);
       |     q.start = q.end = q.memory + (size - 1) / 2;
       | }
       | void queue_push_start(int x) { q.start--; *q.start = x; }
       | void queue_push_end(int x)   { *q.end = x; q.end++; }
       | int  queue_head() { return *q.start; }
       | int  queue_tail() { return *(q.end - 1); }
       | void queue_deque_head() { q.start++; }
       | void queue_deque_tail() { q.end--; }
       | ```
       | 
       | See that the state of the queue is technically three numbers, {
       | memory, start, end } (pointers are just numbers after all). But
       | this is coordinate dependent, as start and end are relative to the
       | location of memory. Now suppose I have a procedure to reallocate
       | the queue size:
       | 
       | ```
       | void queue_realloc(struct Queue *q, int new_size) {
       |     int start_offset = q->start - q->memory;
       |     int end_offset   = q->end   - q->memory;
       |     q->memory = realloc(q->memory, sizeof(int) * new_size);
       |     /* start and end must be recomputed against the new memory */
       |     q->start = q->memory + start_offset;
       |     q->end   = q->memory + end_offset;
       | }
       | ```
       | 
       | Notice that when I do this, the values of start and end can be
       | completely different! However, see that the length of the queue,
       | given by (end - start), is invariant: it hasn't changed!
       | 
       | ---
       | 
       | In the exact same way, a "tensor" is a collection of numbers that
       | describes something physical with respect to a particular
       | coordinate system (the pointers start and end with respect to the
       | memory coordinate system). "tensor calculus" is a bunch of rules
       | that tell you how the numbers change when one changes coordinate
       | systems (ie, how the pointers start and end change when the
       | pointer memory changes). Some quantities that are computed from
       | tensors are "physical", like the length of the queue, as they are
       | invariant under transformations. Tensor calculus gives a
       | principled way to make sure that the final answers we calculate
       | are "invariant" / "physical" / "real". The actual locations of
       | start and end don't matter, as (end - start) will always be the
       | length of the list!
       | 
       | ---
       | 
       | Physicists (and people who write memory allocators) need such
       | elaborate tracking, to keep track of what is "real" and what is
       | "coordinate dependent", since a lot of physics involves crazy
       | coordinate systems, and having ways to know what things are real
       | and what are artefacts of one's coordinate system is invaluable.
       | For a real example, consider the case of singularities of the
       | Schwarzschild solution to GR, where we initially thought there
       | were two singularities, but it later turned out there was only
       | one "real" singularity, and the other singularity was due to a
       | poor choice of coordinate system:
       | 
       | Although there was general consensus that the singularity at r =
       | 0 was a 'genuine' physical singularity, the nature of the
       | singularity at r = rs remained unclear. In 1921 Paul Painleve and
       | in 1922 Allvar Gullstrand independently produced a metric, a
       | spherically symmetric solution of Einstein's equations, which we
       | now know is coordinate transformation of the Schwarzschild
       | metric, Gullstrand-Painleve coordinates, in which there was no
       | singularity at r = rs. They, however, did not recognize that
       | their solutions were just coordinate transforms.
        
       | ijidak wrote:
       | Here is a video series on tensors I've enjoyed:
       | https://youtube.com/playlist?list=PLJHszsWbB6hrkmmq57lX8BV-o...
       | 
       | And this series by Dialect:
       | https://youtube.com/playlist?list=PL__fY7tXwodmfntSAAyBDxZ4_...
        
       | mvaliente2001 wrote:
       | The idea of tensors as "a matrix of numbers" or the example of a
       | cube with vectors on every face never clicked for me. It was this
       | (NASA paper)[https://www.grc.nasa.gov/www/k-12/Numbers/Math/docum
       | ents/Ten...] that finally brought me clarity. The main idea, as
       | others already commented, is that a tensor or rank n is a
       | function that can be applied up to n vector, reducing its rank by
       | one for each vector it consumes.
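       | 
       | A quick numpy sketch of that "consuming vectors" picture (the
       | tensor components and the vector are arbitrary):
       | 
       |     import numpy as np
       |     rng = np.random.default_rng(0)
       |     
       |     T = rng.random((3, 3, 3))   # rank 3: three slots
       |     v = np.array([1.0, 2.0, 3.0])
       |     
       |     T1 = np.tensordot(T, v, axes=([0], [0]))    # rank 3 -> 2
       |     T2 = np.tensordot(T1, v, axes=([0], [0]))   # rank 2 -> 1
       |     x  = np.tensordot(T2, v, axes=([0], [0]))   # rank 1 -> 0
       |     print(T1.shape, T2.shape, float(x))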
        
         | cryptonector wrote:
         | > a tensor or rank n is a function that can be applied up to n
         | vector
         | 
         | There seems to be a grammar problem here.
        
         | qsdf38100 wrote:
         | In your cube example you are using the word "vector" to refer
         | to faces of the cube. Did you mean matrix?
         | 
         | My understanding is that the cube is a rank 3 tensor, the faces
         | (or rather slices) of the cube are rank 2 tensors (aka
         | matrices), and the edges (slices) of the matrices are rank 1
         | tensors (aka vectors).
        
       ___________________________________________________________________
       (page generated 2024-08-12 23:01 UTC)