
LSTM sequence prediction: 3d input to 2d output


I'm working with this LSTM model:



from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense

model = Sequential()
model.add(Masking(mask_value=0, input_shape=(timesteps, features)))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2, return_sequences=False))
model.add(Dense(features, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])


My data has shapes X_train (21, 11, 5) and y_train (21, 5).



Each timestep is represented by 5 features. return_sequences is set to False because I want to predict a single 5-dimensional array (the next timestep) for each input sequence of 11 timesteps.



I get the error:
ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (14, 5).



If I instead reshape the data to X_train (21, 11, 5) and y_train (21, 1, 5), I get the error ValueError: Invalid shape for y: (14, 1, 5).



How should I solve this problem?










Tags: lstm, multilabel-classification, recurrent-neural-net

asked Feb 22 at 9:14 by ginevracoal






1 Answer

What are your features like? Given that your Dense layer outputs a softmax of size 5, this implies that all you want to predict is one feature: a categorical feature with 5 options.



If that is not the case, we need more information about the features to help here.



Your y target for each feature should have shape (num_samples, time_step_len, num_categories_of_feature). You need to one-hot encode each categorical feature separately, which gives the last dimension its size, num_categories_of_feature. As you currently have it, y_train has shape (num_samples, features), so the network has no way to learn the sequence patterns: you only give it the end result. Instead, build y_train to hold the true value of the next time step, for every time step, hence (num_samples, time_step_len, num_categories_of_feature).

Side note: I've only worked with LSTMs/RNNs on one problem, and this is how I did it. I cared about learning the sequences in their entirety, because my inputs at prediction time have variable length. If you always have 11 time steps and always just want the next-step prediction, this might not apply; I honestly don't know.
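For example, here is a minimal sketch of that framing on toy random data (my illustration, assuming a single categorical feature with 5 classes; to_categorical builds the one-hot last dimension):

import numpy as np
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense, TimeDistributed
from keras.utils import to_categorical

n_samples, timesteps, n_classes = 21, 11, 5
X_train = np.random.random((n_samples, timesteps, n_classes))     # placeholder inputs
labels = np.random.randint(0, n_classes, (n_samples, timesteps))  # true next-step class at every step
y_train = to_categorical(labels, num_classes=n_classes)           # shape (21, 11, 5)

model = Sequential()
model.add(Masking(mask_value=0, input_shape=(timesteps, n_classes)))
model.add(LSTM(100, return_sequences=True))                        # one output per time step
model.add(TimeDistributed(Dense(n_classes, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X_train, y_train, epochs=2, verbose=0)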



This is where I'm not totally sure it's the only way to do this, but here is how I think about the problem: to predict 5 categorical variables, you need a way to output a softmax for each variable. A softmax activation of size features, as you have it here, estimates a single probability distribution over 5 values, which implies your y is one categorical feature with 5 possible values. So you will need to set up your network with 5 outputs, each with an independent softmax whose size equals the number of categories of that variable. A single softmax should only be used to estimate a distribution over a single class variable: 10 options for feat1? Softmax of size 10, and so on.



          losses = {"feat1_output": "categorical_crossentropy", "feat2_output": "categorical_crossentropy", "feat3_output": "categorical_crossentropy", "feat4_output": "categorical_crossentropy", "feat5_output": "categorical_crossentropy"}
          lossWeights = {"feat1_output": 1.0, "feat2_output": 1.0, ... , ...}# if everything is equal, dont worry about specifying loss weights.
          metrics = {"feat1_output": "categorical_accuracy", "feat2_output": "categorical_accuracy", "feat3_output": "categorical_accuracy", "feat4_output": "categorical_accuracy", "feat5_output": "categorical_accuracy"}
          opt = Adam(lr=init_lr,decay=init_lr / num_epochs)
          model.compile(loss = losses, loss_weights = lossWeights, optimizer=opt, metrics=metrics)


Now you will be optimizing 5 loss functions at the same time, one for each categorical prediction. You must then have 5 y datasets, each of shape (num_samples, time_step_len, num_categories_of_feature), and pass them to the fit function as a list. For the dictionary keys above to resolve, you also need to name the output layers in the model definition.
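As a rough sketch of what that model definition could look like with the functional API (again my illustration; the layer names and category counts are assumptions):

from keras.models import Model
from keras.layers import Input, Masking, LSTM, Dense, TimeDistributed

timesteps, features = 11, 5
n_cats = [5, 5, 5, 5, 5]  # hypothetical number of categories per feature

inp = Input(shape=(timesteps, features))
x = Masking(mask_value=0)(inp)
x = LSTM(100, return_sequences=True)(x)
# one named softmax head per categorical feature
outputs = [TimeDistributed(Dense(n, activation='softmax'), name="feat%d_output" % (i + 1))(x)
           for i, n in enumerate(n_cats)]
model = Model(inputs=inp, outputs=outputs)

# compile with the losses/lossWeights/metrics dictionaries above, then e.g.:
# model.fit(X_train, [y1, y2, y3, y4, y5], epochs=10)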






answered 1 hour ago by kylec123














