Can I use statistics of original data as input features?


I have a defect-analysis problem where the goal is to improve a factory's manufacturing process.
The factory runs one or more production batches every day, and each batch is divided into groups of pieces. We call each group in a batch a "spoon".
For each batch, I have been given the number of errors and the number of pieces produced. I also have production-process data at the "spoon" level, plus data from other parts of the process (casts, ovens) that I can relate to the rest by date and hour.

I know which "spoons" belong to which batch, but I do not know the exact size of a "spoon" in terms of pieces.
Since the output variable, "proportion of error", is defined at the batch level, I would need batch-level instances in order to develop a supervised learning model.

Is it possible to build those instances by averaging, or by taking other statistical measures of, the "spoons" that form a batch? Or do I have to go directly to unsupervised learning (clustering, outlier detection...)?
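For the part about relating cast and oven data to the rest "by the date and hour", one possible approach is an as-of lookup: attach to each spoon the most recent reading taken at or before the spoon's timestamp. The sketch below uses only the Python standard library. The oven log, the spoon records, the field names, and the helper `latest_reading_before` are all hypothetical, since the question does not describe the actual data layout.

```python
from bisect import bisect_right
from datetime import datetime


def latest_reading_before(times, values, when):
    """Return the value of the most recent reading at or before `when`.

    `times` must be sorted ascending and aligned with `values`.
    Returns None when no reading exists at or before `when`.
    """
    i = bisect_right(times, when)
    return values[i - 1] if i else None


# Hypothetical oven log, sorted by timestamp: (timestamp, temperature).
oven_log = [
    (datetime(2019, 3, 1, 8, 0), 701.0),
    (datetime(2019, 3, 1, 9, 0), 715.0),
    (datetime(2019, 3, 1, 10, 0), 698.0),
]
oven_times = [t for t, _ in oven_log]
oven_temps = [v for _, v in oven_log]

# Hypothetical spoon records with the time each spoon was processed.
spoons = [
    {"batch": "B1", "spoon": 1, "time": datetime(2019, 3, 1, 8, 30)},
    {"batch": "B1", "spoon": 2, "time": datetime(2019, 3, 1, 9, 45)},
    {"batch": "B2", "spoon": 1, "time": datetime(2019, 3, 1, 7, 0)},
]

# Attach the matching oven temperature to every spoon record.
for s in spoons:
    s["oven_temp"] = latest_reading_before(oven_times, oven_temps, s["time"])
```

A spoon processed before the first oven reading gets `None`, which makes missing matches explicit rather than silently borrowing a later reading. With pandas, `merge_asof` performs the same kind of join on sorted timestamp columns.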








  • It's not quite clear what the issue is here. You're predicting at the batch level, and you can use statistics about each batch as you like. Does the variable size of the spoons matter? Is it that some statistics about spoons vary with size, and because you don't know the sizes, you don't know whether they're comparable?
    – Sean Owen, 13 hours ago
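As a rough illustration of using batch-level statistics as features, the sketch below collapses spoon-level measurements into one row per batch and pairs each row with the batch's proportion of errors. It is a minimal example under assumptions: the measurement (a temperature), the batch identifiers, the counts, and the function `batch_features` are invented for illustration and do not come from the question.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical spoon-level measurements: (batch_id, temperature).
spoon_rows = [
    ("B1", 700.0), ("B1", 710.0), ("B1", 720.0),
    ("B2", 690.0), ("B2", 694.0),
]

# Hypothetical batch-level counts: batch_id -> (errors, pieces produced).
batch_counts = {"B1": (5, 100), "B2": (12, 80)}


def batch_features(rows):
    """Collapse spoon-level values into one feature dict per batch."""
    by_batch = defaultdict(list)
    for batch_id, value in rows:
        by_batch[batch_id].append(value)
    return {
        b: {
            "n_spoons": len(v),
            "temp_mean": mean(v),   # unweighted: spoon sizes are unknown
            "temp_std": pstdev(v),  # spread across the spoons of a batch
            "temp_min": min(v),
            "temp_max": max(v),
        }
        for b, v in by_batch.items()
    }


features = batch_features(spoon_rows)

# One supervised training instance per batch: features plus the target.
dataset = [
    {"batch": b, **features[b], "error_rate": errors / pieces}
    for b, (errors, pieces) in batch_counts.items()
]
```

Because the number of pieces per spoon is unknown, every spoon is weighted equally here. Statistics such as the minimum, maximum, and spread do not depend on spoon size, so they may be safer choices than an average if spoon sizes differ a lot within a batch.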
















feature-extraction anomaly-detection






asked 15 hours ago by user3897600
