Find all strings matching a given regex pattern in files in a directory (including all subdirectories)
$begingroup$
#! python3
# `regexSearch`: Finds all lines matching a given regex in each file in a given folder.
# Usage:
# The directory to search and regex to be searched for are provided as command line arguments.
# The 1st and 2nd command line arguments are the directory and regex pattern respectively.
# Script prompts the user to enter the regex.
# After completion, the user is prompted to continue

import re, sys
from os import path, listdir

def regex_search(regex, directory):
    res, lst = {}, listdir(directory)
    for itm in lst:
        pth = path.join(path.abspath(directory), itm)
        if path.isdir(pth): res.update(regex_search(regex, pth)) #Recursively traverse all sub directories.
        else:
            print(pth)
            with open(pth) as file:
                tmp = []
                for idx, line in enumerate(file.readlines()):
                    results = regex.findall(line)
                    if results: tmp.extend([f"Line {idx+1}: {results}"])
                res[pth] = tmp
    return res

if __name__ == "__main__":
    directory, pattern = sys.argv[1:3]
    while not path.isdir(directory):
        print("Error: Please input a valid path for an existing directory:", end = "\t")
        directory = input()
    while True:
        try:
            regex = re.compile(pattern)
            break
        except TypeError:
            print("Error: Please input a valid regex:", end = "\t")
            pattern = input()
        except re.error:
            print("Error: Please input a valid regex:", end = "\t")
            pattern = input()
    matches = regex_search(regex, directory)
    for key in matches: print(key, "\n".join(matches[key]), sep="\n", end="\n\n")
python python-3.x regex file-system
$endgroup$
asked yesterday by Tobi Alafin, edited yesterday by Ludisposed
1 Answer
$begingroup$
Some improvements
Style
Please indent your file properly, since indentation is significant in Python. One-line compound statements like

    if path.isdir(pth): res.update(regex_search(regex, pth))

are frowned upon (PEP 8 discourages them). Instead, write

    if path.isdir(pth):
        res.update(regex_search(regex, pth))
Use glob for listing files in a directory
With Python 3.5+, glob is the easiest way to list all files in a directory and its subdirectories. Before 3.5, you had to use os.walk().
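To make the comparison concrete, here is a minimal sketch of both approaches. The helper names `list_files_glob` and `list_files_walk` are made up for illustration and are not part of the original post. Note that `**` only recurses when it is a whole path component and `recursive=True` is passed, and that glob skips dot-prefixed names by default.

```python
import glob
import os


def list_files_glob(directory):
    """Return all regular files under `directory` using a recursive glob."""
    pattern = os.path.join(directory, "**", "*")
    # glob returns directories too, so keep only regular files.
    # By default glob does not match names that start with a dot.
    return sorted(p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p))


def list_files_walk(directory):
    """Return all files under `directory` using os.walk (works before 3.5)."""
    found = []
    for root, _dirs, files in os.walk(directory):
        found.extend(os.path.join(root, name) for name in files)
    return sorted(found)
```

For a tree without hidden files, both helpers should return the same list of paths.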
Use generators when appropriate
This saves memory, since each result is yielded as soon as it is found instead of being appended to a temporary list first.
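As a small illustration of the idea, the sketch below yields matches one at a time rather than collecting them. The function name `matching_lines` and the sample data are assumptions made for this example only.

```python
import re


def matching_lines(regex, lines):
    """Yield (line_number, line) for each matching line, one at a time."""
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            # Nothing is stored here; the caller receives each hit lazily.
            yield number, line.rstrip("\n")


sample = ["alpha 1\n", "beta\n", "gamma 22\n"]
for number, text in matching_lines(re.compile(r"\d+"), sample):
    print(number, text)
```

Because `lines` can be any iterable, an open file object can be passed directly, so the whole file never has to be held in memory.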
Use argparse over sys.argv
argparse is the standard-library module for CLI input. It is easy to use and has a ton of features; I definitely recommend it!
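One feature that may be useful here is the `type=` callback, which lets argparse validate and convert an argument while parsing. The sketch below is only one possible arrangement; the names `compiled_regex`, `existing_directory`, and `build_parser` are hypothetical.

```python
import argparse
import os
import re


def compiled_regex(text):
    """argparse type callback: turn the raw argument into a compiled pattern."""
    try:
        return re.compile(text)
    except re.error as exc:
        raise argparse.ArgumentTypeError(f"invalid regex {text!r}: {exc}")


def existing_directory(text):
    """argparse type callback: accept only paths that are existing directories."""
    if not os.path.isdir(text):
        raise argparse.ArgumentTypeError(f"not a directory: {text!r}")
    return text


def build_parser():
    parser = argparse.ArgumentParser(description="Search files for a regex.")
    parser.add_argument("regex", type=compiled_regex, help="pattern to search for")
    parser.add_argument("directory", type=existing_directory, help="directory to search")
    return parser


# Passing an explicit list instead of reading sys.argv keeps the demo self-contained.
args = build_parser().parse_args([r"\d+", "."])
print(args.regex.pattern, args.directory)
```

With this layout, an invalid regex or a missing directory makes argparse print a usage message and exit with status 2, so no manual re-prompting loop is needed.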
Code
import argparse
import glob
import re
import os
import pathlib


def regex_search(regex, directory):
    # "**" must be its own path component for the recursive match to work.
    pattern = os.path.join(directory, "**", "*.*")
    for f in glob.glob(pattern, recursive=True):
        if not os.path.isfile(f):
            continue  # glob can also return directories whose names contain a dot
        with open(f) as _file:
            # Iterate the file lazily and count lines from 1.
            for i, line in enumerate(_file, start=1):
                if regex.search(line):
                    yield f"In file {f} matched: {line.rstrip()} at line: {i}"


def parse_args():
    parser = argparse.ArgumentParser(
        usage='%(prog)s [options] <regex> <directory>',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    parser.add_argument('regex', type=str)
    parser.add_argument('directory', type=str)
    args = parser.parse_args()

    try:
        rgx = re.compile(args.regex)
    except re.error:
        # parser.error() prints the message and exits.
        parser.error('Regex does not compile')

    directory = pathlib.Path(args.directory)
    if not os.path.isdir(directory):
        parser.error('Directory is not valid')
    return rgx, directory


if __name__ == '__main__':
    regex, directory = parse_args()
    for match in regex_search(regex, directory):
        print(match)
Bonus Round!
grep is a Unix tool that can basically do this by default:

    grep -Hrn 'search term' path/to/dir

Where:
-H prints the file name with each match
-r does a recursive search
-n prints the line number
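For comparison, a rough Python counterpart that emits the same `path:line_number:line` shape as `grep -Hrn` could look like the sketch below. The function name `grep_like` is hypothetical, and Python's `re` syntax is not identical to grep's default basic regular expressions, so patterns may need adjusting between the two.

```python
import os
import re


def grep_like(pattern, directory):
    """Yield 'path:line_number:line' strings, similar in shape to `grep -Hrn`."""
    regex = re.compile(pattern)
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                # errors="replace" keeps undecodable bytes from raising.
                with open(path, errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield f"{path}:{number}:{line.rstrip()}"
            except OSError:
                # Unreadable file (permissions, vanished, ...): skip it.
                continue
```

Skipping unreadable files silently is a design choice made for this sketch; grep itself reports such files on stderr unless `-s` is given.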
$endgroup$
answered yesterday by Ludisposed, edited yesterday