


Can I rely on these GitHub repository files?















17















I recently found the GitHub repository https://github.com/userEn1gm4/HLuna. After cloning it, I noticed that the binary I compiled from the source file HLuna.cxx (using g++) differs from the binary included in the repository (HLuna); the comparison reports: differ: byte 25, line 1. Is the provided binary file safe to run?

I've already scanned it with VirusTotal, which reported no issues, but I don't have the expertise to decompile it and read the output, and I had already executed the provided binary without thinking about the risks.










reverse-engineering c++ github

edited 22 hours ago by Peter Mortensen
asked yesterday by mcruz2401
















  • 3





    If you're able to compile from source, then just use the version you compiled yourself.

    – Daisetsu
    yesterday






  • 13





    It takes a lot of effort to make builds reproducible (deterministic), because legacy tools were not designed for it (no one cared about that in the past). Debian has been working towards reproducible builds since 2014 and is still not done :)

    – PTwr
    21 hours ago






  • 1





    There is a relevant post (full disclosure: mine) on OpenSource.SE with several helpful links about deterministic and non-deterministic builds: Is there any way to assert that source code corresponds to compiled code?

    – apsillers
    16 hours ago








  • 1





    How do you know you can trust the source code in the repo? Do you audit every single line of code? (The 175-line source file you linked to is small enough to audit, but if it were 10,000 or 100,000 lines of code, would the source code be any safer than the published binaries?)

    – Johnny
    8 hours ago




















3 Answers


















17














Polynomial's answer explains what may be happening and how to deal with it. Here I will illustrate it:

I ran both binaries through strings and diffed the output. That alone shows some completely harmless differences, in particular the compiler used:



GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
> GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
> gcc 8.2.1 20181105
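
The comparison above can be approximated in a few lines of Python. This is only a minimal sketch, not the commands actually used for the output shown: extract_strings and diff_strings are hypothetical helper names, and the extraction only roughly imitates the strings utility (runs of at least four printable ASCII bytes).

```python
import difflib
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes (roughly like `strings`)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def diff_strings(old: bytes, new: bytes) -> tuple[list[str], list[str]]:
    """Return (removed, added): strings present only in old, and only in new."""
    old_strings = extract_strings(old)
    new_strings = extract_strings(new)
    removed, added = [], []
    matcher = difflib.SequenceMatcher(None, old_strings, new_strings, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old_strings[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new_strings[j1:j2])
    return removed, added
```

To try it on real files, read each binary with open(path, "rb").read() and pass the two byte strings to diff_strings. Differences such as the compiler banner will show up as one removed and one added entry.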


Some of the private names used are also different:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_


And some sections seem to be shuffled, so the diff cannot match them exactly.



Even on the same computer, compiling without optimisation and with -O3 produces different files:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev


Even the internal data gets shuffled:



Diccionario creado!                                           <
MENU <
1. Generador de Diccionarios <
0. Salir <
/*** <
* $$| |$$ |$$| <
* $$| |$$ |$$| * $$| |$$ |$$|
* $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$| * $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$|
* $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$| * $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$|
* $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$|
* $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$| * $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$|
* $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$|
* ---------------------------------------------- * ----------------------------------------------
> -------------------
> Diccionario creado!
> MENU
> 1. Generador de Diccionarios
> 0. Salir
> /***
> * $$| |$$ |$$|



This shows that comparing binary files raises many false positives and doesn't tell you anything about the binary's safety.

In this case, I'd use the version I compiled myself, because you have no way of knowing which version was uploaded: the author may have forgotten to recompile after the last tweaks.
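
The difference reported in the question at byte 25 can serve as one more illustration. Assuming the binaries are ELF files (plausible given the GCC banners above, but not confirmed here) and that the comparison tool counts bytes from 1, as cmp does, byte 25 is offset 24, which is where the ELF header stores the program's entry-point address (e_entry). That address commonly changes when the compiler or code layout changes, so a mismatch there says nothing by itself about safety. The sketch below, with the hypothetical helper elf_entry_point, reads that field from the start of a file:

```python
import struct

E_ENTRY_OFFSET = 24  # same offset in 32-bit and 64-bit ELF headers


def elf_entry_point(header: bytes) -> int:
    """Return the e_entry field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or byte order")
    byte_order = "<" if ei_data == 1 else ">"
    field = "I" if ei_class == 1 else "Q"  # 4-byte or 8-byte address
    (entry,) = struct.unpack_from(byte_order + field, header, E_ENTRY_OFFSET)
    return entry
```

Calling it on the first 32 bytes of each binary (for example open(path, "rb").read(32)) would show whether the two entry points differ, which would be consistent with the first mismatch at byte 25.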






answered 21 hours ago by Davidmh
















  • 7





    I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

    – Toby Speight
    13 hours ago











  • @TobySpeight good point, I shall investigate and correct.

    – Davidmh
    7 hours ago











  • …and even an honest author might be unknowingly infected by some malware.

    – spectras
    2 hours ago



















51














Compilation is not deterministic across compiler versions, library versions, operating systems, and a number of other variables, so two builds cannot be verified by comparing them byte for byte. The only way to verify is to perform a diff at the assembly level. There are lots of tools that can do this, but you still need to put in the manual work.
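
As a rough illustration of what a diff at the assembly level involves, the sketch below assumes both binaries have already been disassembled to text with a tool such as objdump -d, and that each instruction line has the usual tab-separated address, raw bytes, and instruction fields. The helper names are hypothetical. It drops addresses and raw bytes so that only the instructions themselves are compared:

```python
import difflib


def normalize_disassembly(text: str) -> list[str]:
    """Keep only the instruction text from objdump-style disassembly lines."""
    instructions = []
    for line in text.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "  401000:<TAB>55 <TAB>push   %rbp".
        if len(fields) >= 3 and fields[0].strip().endswith(":"):
            # Collapse runs of whitespace so spacing differences are ignored.
            instructions.append(" ".join(" ".join(fields[2:]).split()))
    return instructions


def changed_instructions(old: str, new: str) -> list[str]:
    """Return instructions removed ("- ...") or added ("+ ...") between listings."""
    diff = difflib.ndiff(normalize_disassembly(old), normalize_disassembly(new))
    return [entry for entry in diff if entry.startswith(("- ", "+ "))]
```

Operands that embed absolute addresses (call and jump targets, for example) will still differ between builds, and different optimisation levels change the instructions themselves, which is part of the manual work this kind of comparison requires.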






answered yesterday by Polynomial



















  • 31





    Even that isn't going to be reliable across optimization levels.

    – chrylis
    yesterday






  • 39





    Even if the compiled object code is 100% identical, there may still be timestamps in the executable file's metadata which cause the resulting binaries to differ even though the code is identical.

    – Jörg W Mittag
    22 hours ago






  • 1





    Reproducible builds solve this problem.

    – forest
    21 hours ago



















1














If the software is exactly the same at source level, then the question boils down to whether you can trust your compiler, system libraries, and the various utilities used during compilation. If you installed your toolchain from a trusted source and you trust that your computer hasn't been compromised in the meantime, then there's no reason to suspect that the binary you generated is malicious, even if it differs from the "reference" build.

























  • 3





    Of course, Ken Thompson may disagree.

    – Jörg W Mittag
    13 hours ago






  • 1





    @JörgWMittag If you can't trust trust, who can you trust?

    – apsillers
    12 hours ago











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "162"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});






mcruz2401 is a new contributor. Be nice, and check out our Code of Conduct.










draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsecurity.stackexchange.com%2fquestions%2f206000%2fcan-i-rely-on-these-github-repository-files%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









17














Polynomial tells you what may happen, and how to solve it. Here I will illustrate it:



I ran both binaries through strings and diffed them. That enough shows some completely harmless differences, in particular, the compiler used:



GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
> GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
> gcc 8.2.1 20181105


Some of the private names used are also different:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_


And some sections seem to be shuffled, so the diff cannot match them exactly.



Even on the same computer, without optimisation and -O3 shows different files:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev


Even shuffling of internal data:



Diccionario creado!                                           <
MENU <
1. Generador de Diccionarios <
0. Salir <
/*** <
* $$| |$$ |$$| <
* $$| |$$ |$$| * $$| |$$ |$$|
* $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$| * $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$|
* $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$| * $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$|
* $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$|
* $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$| * $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$|
* $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$|
* ---------------------------------------------- * ----------------------------------------------
> -------------------
> Diccionario creado!
> MENU
> 1. Generador de Diccionarios
> 0. Salir
> /***
> * $$| |$$ |$$|



This proves that differing binary files raises many false positives, and doesn't tell you anything about is safety.



In this case, I'd use the version compiled by myself because you have no way to know what version is uploaded, as the author may have forgotten to recompile before the last tweaks.






share|improve this answer








New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
















  • 7





    I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

    – Toby Speight
    13 hours ago











  • @TobySpeight good point, I shall investigate and correct.

    – Davidmh
    7 hours ago











  • …and even a honest author might be unknowingly infected by some malware.

    – spectras
    2 hours ago
















17














Polynomial tells you what may happen, and how to solve it. Here I will illustrate it:



I ran both binaries through strings and diffed them. That enough shows some completely harmless differences, in particular, the compiler used:



GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
> GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
> gcc 8.2.1 20181105


Some of the private names used are also different:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_


And some sections seem to be shuffled, so the diff cannot match them exactly.



Even on the same computer, without optimisation and -O3 shows different files:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev


Even shuffling of internal data:



Diccionario creado!                                           <
MENU <
1. Generador de Diccionarios <
0. Salir <
/*** <
* $$| |$$ |$$| <
* $$| |$$ |$$| * $$| |$$ |$$|
* $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$| * $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$|
* $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$| * $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$|
* $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$|
* $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$| * $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$|
* $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$|
* ---------------------------------------------- * ----------------------------------------------
> -------------------
> Diccionario creado!
> MENU
> 1. Generador de Diccionarios
> 0. Salir
> /***
> * $$| |$$ |$$|



This proves that differing binary files raises many false positives, and doesn't tell you anything about is safety.



In this case, I'd use the version compiled by myself because you have no way to know what version is uploaded, as the author may have forgotten to recompile before the last tweaks.






share|improve this answer








New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
















  • 7





    I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

    – Toby Speight
    13 hours ago











  • @TobySpeight good point, I shall investigate and correct.

    – Davidmh
    7 hours ago











  • …and even a honest author might be unknowingly infected by some malware.

    – spectras
    2 hours ago














17












17








17







Polynomial tells you what may happen, and how to solve it. Here I will illustrate it:



I ran both binaries through strings and diffed them. That enough shows some completely harmless differences, in particular, the compiler used:



GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
> GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
> gcc 8.2.1 20181105


Some of the private names used are also different:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_


And some sections seem to be shuffled, so the diff cannot match them exactly.



Even on the same computer, without optimisation and -O3 shows different files:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev


Even shuffling of internal data:



Diccionario creado!                                           <
MENU <
1. Generador de Diccionarios <
0. Salir <
/*** <
* $$| |$$ |$$| <
* $$| |$$ |$$| * $$| |$$ |$$|
* $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$| * $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$|
* $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$| * $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$|
* $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$|
* $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$| * $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$|
* $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$|
* ---------------------------------------------- * ----------------------------------------------
> -------------------
> Diccionario creado!
> MENU
> 1. Generador de Diccionarios
> 0. Salir
> /***
> * $$| |$$ |$$|



This proves that differing binary files raises many false positives, and doesn't tell you anything about is safety.



In this case, I'd use the version compiled by myself because you have no way to know what version is uploaded, as the author may have forgotten to recompile before the last tweaks.






share|improve this answer








New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.










Polynomial tells you what may happen, and how to solve it. Here I will illustrate it:



I ran both binaries through strings and diffed them. That enough shows some completely harmless differences, in particular, the compiler used:



GCC: (Debian 6.3.0-18) 6.3.0 20170516                         | GCC: (GNU) 8.2.1 20181105 (Red Hat 8.2.1-5)
> GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)
> gcc 8.2.1 20181105


Some of the private names used are also different:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSEOS4_@ | _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEaSERKS4_


And some sections seem to be shuffled, so the diff cannot match them exactly.



Even on the same computer, without optimisation and -O3 shows different files:



_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6appendE | _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev


Even shuffling of internal data:



Diccionario creado!                                           <
MENU <
1. Generador de Diccionarios <
0. Salir <
/*** <
* $$| |$$ |$$| <
* $$| |$$ |$$| * $$| |$$ |$$|
* $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$| * $$| |$$ |$$| $$| |$$ |$$$$$$| |$$$$$$|
* $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$| * $$$$$$$$ |$$| $$| |$$ |$$ __ $$| ____$$|
* $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$| $$| |$$ |$$| |$$| $$$$$$$|
* $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$| * $$| |$$ |$$|___ $$|_|$$ |$$| |$$| $$___$$|
* $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$| * $$| |$$ |$$$$$$$| $$$$$ |$$| |$$| $$$$$$$|
* ---------------------------------------------- * ----------------------------------------------
> -------------------
> Diccionario creado!
> MENU
> 1. Generador de Diccionarios
> 0. Salir
> /***
> * $$| |$$ |$$|



This proves that differing binary files raises many false positives, and doesn't tell you anything about is safety.



In this case, I'd use the version compiled by myself because you have no way to know what version is uploaded, as the author may have forgotten to recompile before the last tweaks.







share|improve this answer








New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this answer



share|improve this answer






New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









answered 21 hours ago









DavidmhDavidmh

28615




28615




New contributor




Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Davidmh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








  • 7





    I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

    – Toby Speight
    13 hours ago











  • @TobySpeight good point, I shall investigate and correct.

    – Davidmh
    7 hours ago











  • …and even a honest author might be unknowingly infected by some malware.

    – spectras
    2 hours ago














  • 7





    I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

    – Toby Speight
    13 hours ago











  • @TobySpeight good point, I shall investigate and correct.

    – Davidmh
    7 hours ago











  • …and even a honest author might be unknowingly infected by some malware.

    – spectras
    2 hours ago








7




7





I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

– Toby Speight
13 hours ago





I don't think those are different names - what's actually happened is that when the immediately adjoining data are printable, strings grabs slightly more text. nm might be a better tool for extracting identifiers.

– Toby Speight
13 hours ago













@TobySpeight good point, I shall investigate and correct.

– Davidmh
7 hours ago





@TobySpeight good point, I shall investigate and correct.

– Davidmh
7 hours ago













…and even a honest author might be unknowingly infected by some malware.

– spectras
2 hours ago





…and even a honest author might be unknowingly infected by some malware.

– spectras
2 hours ago





















51







Compilation is not a deterministic, directly verifiable process across compiler versions, library versions, operating systems, or a number of other variables. The only way to verify is to perform a diff at the assembly level. There are lots of tools that can do this, but you still need to put the manual work in.
















answered yesterday









Polynomial
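As a rough illustration of the assembly-level diff mentioned in this answer, the sketch below compares two disassembly listings with Python's difflib. It assumes the listings were already produced by a disassembler (for example `objdump -d`) and stripped of addresses; the two listings here are invented for the example, and `diff_listings` is a hypothetical helper name.

```python
import difflib

def diff_listings(reference: str, rebuilt: str) -> list:
    """Return unified-diff lines between two disassembly listings."""
    return list(difflib.unified_diff(
        reference.splitlines(), rebuilt.splitlines(),
        fromfile="reference.asm", tofile="rebuilt.asm", lineterm="",
    ))

# Made-up listings: same behaviour, one instruction encoded differently.
reference_asm = """\
push rbp
mov rbp, rsp
mov eax, 0
pop rbp
ret"""

rebuilt_asm = """\
push rbp
mov rbp, rsp
xor eax, eax
pop rbp
ret"""

# Keep only the changed instructions, dropping the file header lines.
changes = [
    line for line in diff_listings(reference_asm, rebuilt_asm)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print(changes)
```

The diff only tells you where the listings differ. Deciding whether a difference is harmless (as with the two ways of zeroing a register here) is the manual work the answer refers to.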









  • 31





    Even that isn't going to be reliable across optimization levels.

    – chrylis
    yesterday






  • 39





    Even if the compiled object code is 100% identical, there may still be timestamps in the executable file's metadata which cause the resulting binaries to differ even though the code is identical.

    – Jörg W Mittag
    22 hours ago






  • 1





    Reproducible builds solve this problem.

    – forest
    21 hours ago
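The comments about embedded timestamps and reproducible builds can be sketched as follows. The artifact layout here (an 8-byte timestamp followed by the code) is a toy format invented for the example, not a real executable format. The only real-world piece is the `SOURCE_DATE_EPOCH` environment variable, a convention from the Reproducible Builds project for pinning build timestamps.

```python
import hashlib
import os
import struct
import time

def build_timestamp() -> int:
    """Use SOURCE_DATE_EPOCH when it is set, otherwise the current time."""
    pinned = os.environ.get("SOURCE_DATE_EPOCH")
    return int(pinned) if pinned is not None else int(time.time())

def toy_build(code: bytes, timestamp: int) -> bytes:
    """Hypothetical artifact: 8-byte little-endian timestamp, then the code."""
    return struct.pack("<Q", timestamp) + code

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

code = b"\x55\x48\x89\xe5\x31\xc0\x5d\xc3"  # stand-in for identical object code

# Two builds of the same code, one hour apart.
build_a = toy_build(code, 1552000000)
build_b = toy_build(code, 1552003600)
hashes_differ = sha256(build_a) != sha256(build_b)
code_identical = build_a[8:] == build_b[8:]

# Pinning the timestamp makes the two artifacts byte-for-byte identical.
os.environ["SOURCE_DATE_EPOCH"] = "1552000000"
pinned_a = toy_build(code, build_timestamp())
pinned_b = toy_build(code, build_timestamp())
print(hashes_differ, code_identical, pinned_a == pinned_b)
```

Real reproducible builds have to pin more than timestamps (build paths, file ordering, tool versions), so this shows only one source of variation.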

































1







If the software is exactly the same at source level, then the question boils down to whether you can trust your compiler, system libraries, and the various utilities used during compilation. If you installed your toolchain from a trusted source and you trust that your computer wasn't compromised in the meantime, then there's no reason to suspect that the binary file you generated will be malicious, even if it differs from the "reference" build.
















answered 16 hours ago









Dmitry Grigoryev









  • 3





    Of course, Ken Thompson may disagree.

    – Jörg W Mittag
    13 hours ago






  • 1





    @JörgWMittag If you can't trust trust, who can you trust?

    – apsillers
    12 hours ago























