Script that fixes YML SIDs to be simplified for production


$begingroup$


I work with YML files that include SIDs and the following structure:



    title:
      "2": "content a" # key: comment
      "3": "content b" # key: comment
      "4": "content c" # key: comment
      "5": "content d" # key: comment
      "6": "content e" # key: comment


Usually, I have to remove some strings (note I never remove the number 1 or 2) so my new file looks like this:



    title:
      "2": "content a" # key: comment
      "3": "content b" # key: comment
      "5": "content d" # key: comment
      "6": "content e" # key: comment


I need to renumber the SIDs so that the sequence has no gaps (in this case 2, 3, 4, 5, 6), regardless of the content. For that reason I have written the following script. It works properly, but I need to bring it into production, so I would like your help to reduce its complexity and make it clearer and simpler, along with any other advice you may have for a beginner (in both Python and Stack Exchange).



    import re, os

    file=input ('YML file name: ')

    #read the file and store its content as a list
    os.chdir('/home/balaclava/Desktop/Scripts/YML fixer/')
    rdfile= open(file)
    cont=rdfile.readlines()
    rdfile.close()

    #list to store the reviewed strings
    newfile=[]
    newfile.append(cont[0]+cont[1])

    #Get the second string SID as reference
    numRegex = re.compile(r'\d+')
    act=numRegex.search(cont[1])
    global refnum
    refnum=int(act.group())

    #Loop for each string (-2 due to the two first string are excluded)
    for i in range(len(cont)-2):
        act=numRegex.search(str(cont[i+2]))
        temp=int(act.group())
        #If the SID is correct, add item to newlist, else, fix it and add the item to the list.
        if temp == (refnum+1):
            newfile.append(cont[i+2])
        else:
            temp= (refnum+1)
            change=numRegex.sub(str(temp), cont[i+2])
            newfile.append(change)
        refnum += 1

    #overwrite the file with the newlist content
    with open (file,'w') as finalfile:
        finalfile.write(''.join(newfile))
        finalfile.close()









$endgroup$








  • 1




    $begingroup$
    Instead of attacking this with regexes, why not just use yaml.load to read it in and modify the resulting data structure with regular Python operations? It would be clear, and there would be no risk of corrupting the format.
    $endgroup$
    – Austin Hastings
    Mar 15 at 4:01










  • $begingroup$
    @AustinHastings, yaml.load would not keep the comments.
    $endgroup$
    – Balaclava
    Mar 17 at 17:52


















python beginner yaml






asked Mar 15 at 0:10









Balaclava

1 Answer
$begingroup$

You could use ruamel.yaml; it can preserve comments. See https://stackoverflow.com/questions/7255885/save-dump-a-yaml-file-with-comments-in-pyyaml#27103244



Moreover, if you want to become a better Python developer (or maybe Pythonista?), I can give you some tips:



Content duplication
You store the file content in cont and, after closing the file, duplicate that information in a new variable, newfile. I think that is unnecessary in this situation. You could keep all the data in cont and modify only the lines that need it. You can replace the entire if-else with:



    if temp != (refnum+1):
        temp = (refnum+1)
        change = numRegex.sub(str(temp), cont[i])
        cont[i] = change
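As a rough sketch of the in-place idea, the replacement of a single line could be isolated in a small helper. The name fix_sid and the sample lines are hypothetical, and passing count=1 is an assumption that is not in the original script: it limits the substitution to the first number on the line, so digits in the content or the comment are left alone.

```python
import re

NUM_REGEX = re.compile(r'\d+')


def fix_sid(line, expected):
    """Return line with its first number replaced by expected, if they differ."""
    match = NUM_REGEX.search(line)
    if match is None or int(match.group()) == expected:
        return line
    # count=1 is an assumption added here: only the SID is rewritten,
    # not digits that may appear in the content or the comment.
    return NUM_REGEX.sub(str(expected), line, count=1)


# Hypothetical sample data: the third line has a gap (5 instead of 3).
cont = [
    'title:\n',
    '  "2": "content a" # key: comment\n',
    '  "5": "content d" # note 7\n',
]
cont[2] = fix_sid(cont[2], 3)  # modify the list in place, no second list needed
```

With this shape the list cont is the only copy of the data, which is the point of the tip above.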


For loop's range
Change the range call in the for loop to range(2, len(cont)).
Inside the loop you can then access the current line simply as cont[i], which is more readable and efficient.



As with i, you always access refnum with a +1. If you initialise it as refnum = int(act.group()) + 1, you save that operation inside the loop. Alternatively, you can do the += 1 increment at the beginning of the loop.
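The two loop suggestions can be combined roughly as follows. This is only a sketch: the function name renumber is hypothetical, it uses the "increment at the beginning of the loop" variant, and, like the original script, it assumes every line after the title contains a SID. The count=1 argument is again an assumption that restricts the substitution to the first number on the line.

```python
import re


def renumber(cont):
    """Renumber the SIDs in cont in place so they follow the first SID without gaps."""
    num_regex = re.compile(r'\d+')
    # The SID on the second line is the reference and is never changed.
    refnum = int(num_regex.search(cont[1]).group())
    for i in range(2, len(cont)):
        refnum += 1  # incremented first, so refnum is the expected SID of cont[i]
        if int(num_regex.search(cont[i]).group()) != refnum:
            # count=1 (an assumption) rewrites only the SID, not later digits.
            cont[i] = num_regex.sub(str(refnum), cont[i], count=1)
    return cont


# Hypothetical sample matching the file from the question after a removal.
lines = [
    'title:\n',
    '  "2": "content a" # key: comment\n',
    '  "3": "content b" # key: comment\n',
    '  "5": "content d" # key: comment\n',
    '  "6": "content e" # key: comment\n',
]
renumber(lines)
sids = [line.split('"')[1] for line in lines[1:]]
```

Because the loop index starts at 2, there is no i+2 arithmetic left in the body.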



File management
You don't need to close files manually when using a with statement, so you can remove finalfile.close(). Also, you use with when writing but not when reading; consider using the same method in both places.
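A minimal sketch of consistent file handling, with a with statement for both reading and writing. The helper name rewrite_lines, the file name demo.yml, and the upper-casing transform are placeholders for illustration only; the real script would pass its own renumbering step.

```python
import os
import tempfile


def rewrite_lines(path, transform):
    """Read path, apply transform to its list of lines, and write the result back."""
    # Reading: the with block closes the file even if readlines() raises.
    with open(path) as source:
        cont = source.readlines()
    cont = transform(cont)
    # Writing: no explicit close() is needed here either.
    with open(path, 'w') as target:
        target.writelines(cont)


# Usage on a throwaway file; the upper-casing transform is only a stand-in.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, 'demo.yml')
    with open(demo_path, 'w') as demo:
        demo.write('title:\n  "2": "content a"\n')
    rewrite_lines(demo_path, lambda lines: [line.upper() for line in lines])
    with open(demo_path) as demo:
        result = demo.read()
```

Taking the path as a parameter also removes the need for os.chdir with a hard-coded directory.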



More things can be changed, but I think that's enough for now.












$endgroup$













        edited Mar 22 at 11:58






























        answered Mar 22 at 11:09









        Gardo
