Parsing a string of key-value pairs as a dictionary

I often use nested list and dictionary comprehensions to parse unstructured data, and this is a typical example of how I use them.

In [14]: data = """
41:n
43:n
44:n
46:n
47:n
49:n
50:n
51:n
52:n
53:n
54:n
55:cm
56:n
57:n
58:n"""

In [15]: {int(line.split(":")[0]):line.split(":")[1] for line in data.split("\n") if len(line.split(":"))==2}

Out [15]:
{41: 'n',
 43: 'n',
 44: 'n',
 46: 'n',
 47: 'n',
 49: 'n',
 50: 'n',
 51: 'n',
 52: 'n',
 53: 'n',
 54: 'n',
 55: 'cm',
 56: 'n',
 57: 'n',
 58: 'n'}

Here I am calling line.split(":") three times. Is there a better way to do this?

edited 17 hours ago

200_success

asked 21 hours ago

Rahul Patel

This would benefit from a better description of "unstructured data". The presented example is very well structured and could be eval'd as a dict with only minor changes.
– TemporalWolf
6 hours ago

python python-3.x parsing dictionary


4 Answers
There's nothing wrong with the solution you have come up with, but if you want an alternative, a regex might come in handy here:

In [10]: import re

In [11]: data = """
    ...: 41:n
    ...: 43:n
    ...: 44:n
    ...: 46:n
    ...: 47:n
    ...: 49:n
    ...: 50:n
    ...: 51:n
    ...: 52:n
    ...: 53:n
    ...: 54:n
    ...: 55:cm
    ...: 56:n
    ...: 57:n
    ...: 58:n"""

In [12]: dict(re.findall(r'(\d+):(.*)', data))
Out[12]:
{'41': 'n',
 '43': 'n',
 '44': 'n',
 '46': 'n',
 '47': 'n',
 '49': 'n',
 '50': 'n',
 '51': 'n',
 '52': 'n',
 '53': 'n',
 '54': 'n',
 '55': 'cm',
 '56': 'n',
 '57': 'n',
 '58': 'n'}

Explanation:

1st Capturing Group (\d+):

\d matches a digit (equal to [0-9])
+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

: matches the character : literally (case sensitive)

2nd Capturing Group (.*):

. matches any character (except for line terminators)
* Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it, since you're casting it to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))
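Note that re.findall returns strings, so the keys above are '41' rather than 41. If integer keys are needed, as in the question, one option is to convert them in a dict comprehension. This is a minimal sketch on a shortened copy of the data; the variable names pairs and result are only illustrative:

```python
import re

data = """
41:n
43:n
55:cm
58:n"""

# \d+ captures the numeric key, .* captures the rest of the line as the value.
pairs = re.findall(r'(\d+):(.*)', data)

# Convert each captured key to int while building the dict.
result = {int(key): value for key, value in pairs}
print(result)
```

Each line is still matched only once, so this keeps the one-pass character of the regex version.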

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

You might ask, why would you want to use the more complicated and verbose syntax of regular expressions rather than the more intuitive and simple string methods? Sometimes, the advantage is that regular expressions offer far more flexibility.

Regarding @Rahul's comment about speed, I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:

How many times you parse the regex

How cleverly you write your string code

Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivalent string manipulation code that performs well.

As far as I can tell, string operations will almost always beat regular expressions. But the more complex the task gets, the harder it is for string operations to keep up, not only in performance but also in maintainability.
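Rather than guessing, the two approaches can be timed on your own data with timeit. The following is a rough sketch, not a benchmark: the function names, the generated sample data and the repetition count are all assumptions made for illustration, and the numbers will vary by machine and Python version.

```python
import re
import timeit

# Sample input in the same "key:value" format as the question.
data = "\n" + "\n".join(f"{i}:n" for i in range(41, 59))

PAIR = re.compile(r'(\d+):(.*)')  # compiled once, reused on every call


def parse_with_regex(text):
    return {int(k): v for k, v in PAIR.findall(text)}


def parse_with_split(text):
    return {int(k): v for k, v in (line.split(':') for line in text.split())}


# Both parsers should agree before their timings are compared.
assert parse_with_regex(data) == parse_with_split(data)

for func in (parse_with_regex, parse_with_split):
    seconds = timeit.timeit(lambda: func(data), number=10_000)
    print(f"{func.__name__}: {seconds:.3f} s")
```

Precompiling the pattern mainly matters when the same regex is applied many times, which is the case the list above refers to.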

edited 17 hours ago

answered 17 hours ago

яүυк

Yeah. I think regexes are slow too.
– Rahul Patel
17 hours ago

Note that the comprehension is much easier to read if you break it across several lines instead of putting everything on one line.

You could use unpacking to remove some of the calls to line.split:

>>> {
...    int(k): v
...    for line in data.split()
...    for k, v in (line.split(':'),)
... }
{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or, if the keys can stay as str, you could use dict().

dict() accepts an iterable of two-item sequences, so each line.split(':') result becomes a key-value pair for you:

>>> dict(
...    line.split(':')
...    for line in data.split()
... )
{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}
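The single-element tuple in `for k, v in (line.split(':'),)` exists only to bind k and v. An equivalent form, offered as a sketch rather than a recommendation, feeds the comprehension from an inner generator expression, so each line is still split only once. The data is shortened and the names pairs and result are illustrative:

```python
data = """
41:n
43:n
55:cm
58:n"""

# Inner generator: one split per line, yielding [key, value] lists lazily.
pairs = (line.split(':') for line in data.split())

# Outer comprehension: unpack each pair and convert the key to int.
result = {int(k): v for k, v in pairs}
print(result)
```

Which of the two spellings reads better is largely a matter of taste; both keep everything inside a comprehension.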

edited 12 hours ago

answered 14 hours ago

Ludisposed


Your string looks very similar to YAML syntax. Indeed, it is almost valid syntax for a mapping (an associative array); only the spaces after the : are missing. So, why not use a YAML parser?

import yaml

data = """
41:n
43:n
44:n
46:n
47:n
49:n
50:n
51:n
52:n
53:n
54:n
55:cm
56:n
57:n
58:n"""

# safe_load only builds plain Python objects; yaml.load requires an
# explicit Loader argument in PyYAML 6 and later.
print(yaml.safe_load(data.replace(":", ": ")))
# {41: 'n',
#  43: 'n',
#  44: 'n',
#  46: 'n',
#  47: 'n',
#  49: 'n',
#  50: 'n',
#  51: 'n',
#  52: 'n',
#  53: 'n',
#  54: 'n',
#  55: 'cm',
#  56: 'n',
#  57: 'n',
#  58: 'n'}

You might have to install it first, which you can do via pip install pyyaml.

answered 9 hours ago

Graipher


You have too much logic in the dict comprehension:

{int(line.split(":")[0]):line.split(":")[1] for line in data.split("\n") if len(line.split(":"))==2}

First of all, let's expand it to a normal for-loop:

>>> result = {}
>>> for line in data.split("\n"):
...     if len(line.split(":"))==2:
...         result[int(line.split(":")[0])] = line.split(":")[1]
...
>>> result

I can see that you use the check if len(line.split(":"))==2: to skip the empty string at the start of data.split("\n"):

>>> data.split("\n")
['',
 '41:n',
 '43:n',
 ...
 '58:n']

But the docs for str.split advise calling str.split() without a sep argument if you want to discard the empty string at the beginning:

>>> data.split()
['41:n',
 '43:n',
 ...
 '58:n']
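The difference between the two calls is easy to see on a small string. This is a minimal sketch with made-up sample data that also contains a blank line in the middle and a trailing newline:

```python
data = "\n41:n\n43:n\n\n58:n\n"

# With an explicit separator, empty strings are kept for leading,
# trailing, and consecutive newlines.
with_sep = data.split("\n")

# With no separator, runs of whitespace act as one delimiter and
# leading/trailing whitespace produces no empty strings.
without_sep = data.split()

print(with_sep)     # ['', '41:n', '43:n', '', '58:n', '']
print(without_sep)  # ['41:n', '43:n', '58:n']
```

One caveat: split() with no argument splits on any whitespace, so it is only a safe replacement while neither keys nor values contain spaces. If they might, str.splitlines() combined with a blank-line check may be the better fit.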

So, now we can remove the unnecessary check from your code:

>>> result = {}
>>> for line in data.split():
...     result[int(line.split(":")[0])] = line.split(":")[1]
...
>>> result

Here you calculate line.split(":") twice. Take it out:

>>> result = {}
>>> for line in data.split():
...    key, value = line.split(":")
...    result[int(key)] = value
...
>>> result

This is the most basic version. Don't put it back into a dict comprehension, as it will still look quite complex. But you could make a function out of it. For example, something like this:

>>> def to_key_value(line, sep=':'):
...     key, value = line.split(sep)
...     return int(key), value
...
>>> dict(map(to_key_value, data.split()))
{41: 'n',
 43: 'n',
 ...
 58: 'n'}
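If a value could itself contain the separator, the two-name unpacking above would raise a ValueError. Here is a hedged variant, assuming that only the first separator should delimit the key; the function name to_key_value_first_sep and the sample line '60:a:b' are mine, not part of the question's data:

```python
def to_key_value_first_sep(line, sep=':'):
    # maxsplit=1 splits at the first separator only, so the value
    # may contain further separators.
    key, value = line.split(sep, 1)
    return int(key), value


lines = ["41:n", "55:cm", "60:a:b"]
result = dict(map(to_key_value_first_sep, lines))
print(result)  # {41: 'n', 55: 'cm', 60: 'a:b'}
```

A line with no separator at all still raises ValueError, which is arguably the right outcome for malformed input.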

Another option that I came up with:

>>> from functools import partial
>>> lines = data.split()
>>> split_by_colon = partial(str.split, sep=':')
>>> key_value_pairs = map(split_by_colon, lines)
>>> {int(key): value for key, value in key_value_pairs}
{41: 'n',
 43: 'n',
 ...
 58: 'n'}

Also, if you don't want to keep in memory a list of results from data.split, you might find this helpful: Is there a generator version of string.split() in Python?
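One possible way to avoid building the full list, shown here as a sketch and not taken from the linked question, is to iterate over the text with io.StringIO, which yields one line at a time. The helper name iter_pairs is hypothetical and the data is shortened:

```python
import io

data = """
41:n
43:n
55:cm
58:n"""


def iter_pairs(text, sep=':'):
    # StringIO yields one line at a time instead of building a full list.
    for raw_line in io.StringIO(text):
        line = raw_line.strip()
        if line:  # skip blank lines, such as the leading one
            key, value = line.split(sep)
            yield int(key), value


result = dict(iter_pairs(data))
print(result)
```

For input of this size the saving is negligible; the lazy form only starts to matter for very large strings or when the pairs are consumed one by one.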

answered 13 hours ago

Georgy

I said I want solution for list/dict comprehension. Your solution is nice but looks ugly. Thanks.
– Rahul Patel
13 hours ago

@RahulPatel: You might want to learn constructive criticism and some diplomacy ;) Georgy was nice enough to spend time on your problem...
– Eric Duminil
4 hours ago

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f214510%2fparsing-a-string-of-key-value-pairs-as-a-dictionary%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

There's nothing wrong with the solution you have come with, but if you want an alternative, regex might come in handy here:

In [10]: import re

In [11]: data = """ 

    ...: 41:n 

    ...: 43:n 

    ...: 44:n 

    ...: 46:n 

    ...: 47:n 

    ...: 49:n 

    ...: 50:n 

    ...: 51:n 

    ...: 52:n 

    ...: 53:n 

    ...: 54:n 

    ...: 55:cm 

    ...: 56:n 

    ...: 57:n 

    ...: 58:n"""                                                                                                                                                                                                                                                         



In [12]: dict(re.findall(r'(d+):(.*)', data))                                                                                                                                                                                                                           

Out[12]: 

{'41': 'n',

 '43': 'n',

 '44': 'n',

 '46': 'n',

 '47': 'n',

 '49': 'n',

 '50': 'n',

 '51': 'n',

 '52': 'n',

 '53': 'n',

 '54': 'n',

 '55': 'cm',

 '56': 'n',

 '57': 'n',

 '58': 'n'}

Explanation:

1st Capturing Group (d+):

2nd Capturing Group (.*):

.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it since your casting that to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

Regarding the comment of @Rahul regarding speed I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:

How many times you parse the regex

How cleverly you write your string code

Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivlent string manipulation code that performs well.

edited 17 hours ago

answered 17 hours ago

яүυк

7,16122054

$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
17 hours ago

add a comment |

There's nothing wrong with the solution you have come with, but if you want an alternative, regex might come in handy here:

In [10]: import re

In [11]: data = """ 

    ...: 41:n 

    ...: 43:n 

    ...: 44:n 

    ...: 46:n 

    ...: 47:n 

    ...: 49:n 

    ...: 50:n 

    ...: 51:n 

    ...: 52:n 

    ...: 53:n 

    ...: 54:n 

    ...: 55:cm 

    ...: 56:n 

    ...: 57:n 

    ...: 58:n"""                                                                                                                                                                                                                                                         



In [12]: dict(re.findall(r'(d+):(.*)', data))                                                                                                                                                                                                                           

Out[12]: 

{'41': 'n',

 '43': 'n',

 '44': 'n',

 '46': 'n',

 '47': 'n',

 '49': 'n',

 '50': 'n',

 '51': 'n',

 '52': 'n',

 '53': 'n',

 '54': 'n',

 '55': 'cm',

 '56': 'n',

 '57': 'n',

 '58': 'n'}

Explanation:

1st Capturing Group (d+):

2nd Capturing Group (.*):

.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it since your casting that to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

Regarding the comment of @Rahul regarding speed I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:

How many times you parse the regex

How cleverly you write your string code

Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivlent string manipulation code that performs well.

edited 17 hours ago

answered 17 hours ago

яүυк

7,16122054

$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
17 hours ago

add a comment |

There's nothing wrong with the solution you have come with, but if you want an alternative, regex might come in handy here:

In [10]: import re

In [11]: data = """ 

    ...: 41:n 

    ...: 43:n 

    ...: 44:n 

    ...: 46:n 

    ...: 47:n 

    ...: 49:n 

    ...: 50:n 

    ...: 51:n 

    ...: 52:n 

    ...: 53:n 

    ...: 54:n 

    ...: 55:cm 

    ...: 56:n 

    ...: 57:n 

    ...: 58:n"""                                                                                                                                                                                                                                                         



In [12]: dict(re.findall(r'(d+):(.*)', data))                                                                                                                                                                                                                           

Out[12]: 

{'41': 'n',

 '43': 'n',

 '44': 'n',

 '46': 'n',

 '47': 'n',

 '49': 'n',

 '50': 'n',

 '51': 'n',

 '52': 'n',

 '53': 'n',

 '54': 'n',

 '55': 'cm',

 '56': 'n',

 '57': 'n',

 '58': 'n'}

Explanation:

1st Capturing Group (d+):

2nd Capturing Group (.*):

.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it since your casting that to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

Regarding the comment of @Rahul regarding speed I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:

How many times you parse the regex

How cleverly you write your string code

Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivlent string manipulation code that performs well.

edited 17 hours ago

answered 17 hours ago

яүυк

7,16122054

There's nothing wrong with the solution you have come with, but if you want an alternative, regex might come in handy here:

In [10]: import re

In [11]: data = """ 

    ...: 41:n 

    ...: 43:n 

    ...: 44:n 

    ...: 46:n 

    ...: 47:n 

    ...: 49:n 

    ...: 50:n 

    ...: 51:n 

    ...: 52:n 

    ...: 53:n 

    ...: 54:n 

    ...: 55:cm 

    ...: 56:n 

    ...: 57:n 

    ...: 58:n"""                                                                                                                                                                                                                                                         



In [12]: dict(re.findall(r'(d+):(.*)', data))                                                                                                                                                                                                                           

Out[12]: 

{'41': 'n',

 '43': 'n',

 '44': 'n',

 '46': 'n',

 '47': 'n',

 '49': 'n',

 '50': 'n',

 '51': 'n',

 '52': 'n',

 '53': 'n',

 '54': 'n',

 '55': 'cm',

 '56': 'n',

 '57': 'n',

 '58': 'n'}

Explanation:

1st Capturing Group (d+):

2nd Capturing Group (.*):

.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it since your casting that to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

Regarding the comment of @Rahul regarding speed I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:

How many times you parse the regex

How cleverly you write your string code

Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivlent string manipulation code that performs well.

edited 17 hours ago

answered 17 hours ago

яүυк

7,16122054

edited 17 hours ago

answered 17 hours ago

яүυк

7,16122054

answered 17 hours ago

яүυк

7,16122054

answered 17 hours ago

яүυк

7,16122054

$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
17 hours ago

add a comment |

$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
17 hours ago

Yeah. I think regexes are slow too.

– Rahul Patel
17 hours ago

add a comment |

Note it is much easier to read if you chop up the comprehension into blocks, instead of having them all on one line

You could use unpacking to remove some usages of line.split

>>> {

...    int(k): v

...    for line in data.split() 

...    for k, v in (line.split(':'),)

... }

{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or if the first argument can be of str type you could use dict().

This will unpack the line.split and convert them into a key, value pair for you

>>> dict(

...    line.split(':') 

...    for line in data.split() 

... )

{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}

edited 12 hours ago

answered 14 hours ago

Ludisposed

8,32722161

add a comment |

Note it is much easier to read if you chop up the comprehension into blocks, instead of having them all on one line

You could use unpacking to remove some usages of line.split

>>> {

...    int(k): v

...    for line in data.split() 

...    for k, v in (line.split(':'),)

... }

{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or if the first argument can be of str type you could use dict().

This will unpack the line.split and convert them into a key, value pair for you

>>> dict(

...    line.split(':') 

...    for line in data.split() 

... )

{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}

edited 12 hours ago

answered 14 hours ago

Ludisposed

8,32722161

add a comment |

Note it is much easier to read if you chop up the comprehension into blocks, instead of having them all on one line

You could use unpacking to remove some usages of line.split

>>> {

...    int(k): v

...    for line in data.split() 

...    for k, v in (line.split(':'),)

... }

{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or if the first argument can be of str type you could use dict().

This will unpack the line.split and convert them into a key, value pair for you

>>> dict(

...    line.split(':') 

...    for line in data.split() 

... )

{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}

edited 12 hours ago

answered 14 hours ago

Ludisposed

8,32722161

Note it is much easier to read if you chop up the comprehension into blocks, instead of having them all on one line

You could use unpacking to remove some usages of line.split

>>> {

...    int(k): v

...    for line in data.split() 

...    for k, v in (line.split(':'),)

... }

{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or if the first argument can be of str type you could use dict().

This will unpack the line.split and convert them into a key, value pair for you

>>> dict(

...    line.split(':') 

...    for line in data.split() 

... )

{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}

edited 12 hours ago

answered 14 hours ago

Ludisposed

8,32722161

edited 12 hours ago

answered 14 hours ago

Ludisposed

8,32722161

answered 14 hours ago

Ludisposed

8,32722161

answered 14 hours ago

Ludisposed

8,32722161

add a comment |

Your string looks very similar to the YAML syntax. Indeed it is almost valid syntax for an associative list, there are only spaces missing after the :. So, why not use a YAML parser?

import yaml



data = """

41:n

43:n

44:n

46:n

47:n

49:n

50:n

51:n

52:n

53:n

54:n

55:cm

56:n

57:n

58:n"""



print(yaml.load(data.replace(":", ": ")))

# {41: 'n',

#  43: 'n',

#  44: 'n',

#  46: 'n',

#  47: 'n',

#  49: 'n',

#  50: 'n',

#  51: 'n',

#  52: 'n',

#  53: 'n',

#  54: 'n',

#  55: 'cm',

#  56: 'n',

#  57: 'n',

#  58: 'n'}

You might have to install it first, which you can do via pip install yaml.

answered 9 hours ago

Graipher

25.2k53687

add a comment |

Your string looks very similar to the YAML syntax. Indeed it is almost valid syntax for an associative list, there are only spaces missing after the :. So, why not use a YAML parser?

import yaml



data = """

41:n

43:n

44:n

46:n

47:n

49:n

50:n

51:n

52:n

53:n

54:n

55:cm

56:n

57:n

58:n"""



print(yaml.load(data.replace(":", ": ")))

# {41: 'n',

#  43: 'n',

#  44: 'n',

#  46: 'n',

#  47: 'n',

#  49: 'n',

#  50: 'n',

#  51: 'n',

#  52: 'n',

#  53: 'n',

#  54: 'n',

#  55: 'cm',

#  56: 'n',

#  57: 'n',

#  58: 'n'}

You might have to install it first, which you can do via pip install yaml.

answered 9 hours ago

Graipher

25.2k53687

add a comment |

Your string looks very similar to the YAML syntax. Indeed it is almost valid syntax for an associative list, there are only spaces missing after the :. So, why not use a YAML parser?

import yaml



data = """

41:n

43:n

44:n

46:n

47:n

49:n

50:n

51:n

52:n

53:n

54:n

55:cm

56:n

57:n

58:n"""



print(yaml.load(data.replace(":", ": ")))

# {41: 'n',

#  43: 'n',

#  44: 'n',

#  46: 'n',

#  47: 'n',

#  49: 'n',

#  50: 'n',

#  51: 'n',

#  52: 'n',

#  53: 'n',

#  54: 'n',

#  55: 'cm',

#  56: 'n',

#  57: 'n',

#  58: 'n'}

You might have to install it first, which you can do via pip install yaml.

answered 9 hours ago

Graipher

25.2k53687

Your string looks very similar to the YAML syntax. Indeed it is almost valid syntax for an associative list, there are only spaces missing after the :. So, why not use a YAML parser?

import yaml



data = """

41:n

43:n

44:n

46:n

47:n

49:n

50:n

51:n

52:n

53:n

54:n

55:cm

56:n

57:n

58:n"""



print(yaml.load(data.replace(":", ": ")))

# {41: 'n',

#  43: 'n',

#  44: 'n',

#  46: 'n',

#  47: 'n',

#  49: 'n',

#  50: 'n',

#  51: 'n',

#  52: 'n',

#  53: 'n',

#  54: 'n',

#  55: 'cm',

#  56: 'n',

#  57: 'n',

#  58: 'n'}

You might have to install it first, which you can do via pip install yaml.

answered 9 hours ago

Graipher

25.2k53687

answered 9 hours ago

Graipher

25.2k53687

answered 9 hours ago

Graipher

25.2k53687

answered 9 hours ago

Graipher

25.2k53687

add a comment |

You have too much logic in the dict comprehension:

{int(line.split(":")[0]):line.split(":")[1] for line in data.split("n") if len(line.split(":"))==2}

First of all, let's expand it to a normal for-loop:

>>> result = {}
>>> for line in data.split("\n"):
...     if len(line.split(":")) == 2:
...         result[int(line.split(":")[0])] = line.split(":")[1]
>>> result

I can see that you use the check if len(line.split(":")) == 2 to eliminate the empty string that data.split("\n") produces as its first element:

>>> data.split("\n")
['',
 '41:n',
 '43:n',
 ...
 '58:n']

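But the docs for str.split advise calling str.split() without a sep argument if you want to discard the empty string at the beginning. With no separator, runs of whitespace count as a single separator and no empty strings appear at the start or end of the result:

>>> data.split()
['41:n',
 '43:n',
 ...
 '58:n']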
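The difference between the two calls can be checked directly. This is a minimal sketch; the three-line sample string is a shortened stand-in for the question's data (an assumption about its layout: a leading newline followed by one key:value pair per line), not the original input.

```python
# Shortened stand-in for the question's data (assumed layout:
# a leading newline, then one "key:value" pair per line).
data = """
41:n
55:cm
58:n"""

# An explicit separator keeps the empty string before the first newline.
with_sep = data.split("\n")

# No separator: split on runs of whitespace, with no empty strings
# at the start or end of the result.
without_sep = data.split()

print(with_sep)     # ['', '41:n', '55:cm', '58:n']
print(without_sep)  # ['41:n', '55:cm', '58:n']
```

One caveat: str.split() with no argument also splits on spaces and tabs, so it only fits data whose values contain no whitespace.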

So now we can remove the unnecessary check from your code:

>>> result = {}
>>> for line in data.split():
...     result[int(line.split(":")[0])] = line.split(":")[1]
>>> result

Here you calculate line.split(":") twice. Call it once and unpack the result:

>>> result = {}
>>> for line in data.split():
...     key, value = line.split(":")
...     result[int(key)] = value
>>> result

This is the most basic version. Don't put it back into a dict comprehension, as it would still look quite complex. But you could make a function out of it, for example:

>>> def to_key_value(line, sep=':'):
...     key, value = line.split(sep)
...     return int(key), value

>>> dict(map(to_key_value, data.split()))
{41: 'n',
 43: 'n',
 ...
 58: 'n'}
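As written, to_key_value raises a ValueError from tuple unpacking when a line has no separator or more than one. If values might themselves contain the separator (an assumption; the question's data shows no such case), a variant built on str.partition splits only at the first occurrence. The sample data below is made up for illustration.

```python
def to_key_value(line, sep=":"):
    """Split 'key<sep>value' at the first separator; return (int key, str value)."""
    key, found, value = line.partition(sep)
    if not found:
        # partition returns an empty middle element when sep is absent
        raise ValueError(f"missing {sep!r} in line: {line!r}")
    return int(key), value


# Made-up sample: the last value contains the separator itself.
data = "\n41:n\n55:cm\n58:a:b"

result = dict(map(to_key_value, data.split()))
print(result)  # {41: 'n', 55: 'cm', 58: 'a:b'}
```

Whether a second colon should be kept in the value or treated as an error depends on the data format, so this is a design choice rather than a fix.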

Another option that I came up with:

>>> from functools import partial
>>> lines = data.split()
>>> split_by_colon = partial(str.split, sep=':')
>>> key_value_pairs = map(split_by_colon, lines)
>>> {int(key): value for key, value in key_value_pairs}
{41: 'n',
 43: 'n',
 ...
 58: 'n'}

Also, if you don't want to keep the whole list returned by data.split() in memory, you might find this question helpful: "Is there a generator version of string.split() in Python?"
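One way to get lazy splitting, sketched here with re.finditer from the standard library, is to yield each run of non-whitespace characters as it is found. The helper name iter_tokens is hypothetical, to_key_value is repeated from above so the block runs on its own, and the sample string is a shortened stand-in for the question's data.

```python
import re


def iter_tokens(text):
    """Yield whitespace-separated tokens one at a time, without building a list."""
    for match in re.finditer(r"\S+", text):
        yield match.group()


def to_key_value(line, sep=":"):
    key, value = line.split(sep)
    return int(key), value


# Shortened stand-in for the question's data.
data = "\n41:n\n55:cm\n58:n"

result = dict(map(to_key_value, iter_tokens(data)))
print(result)  # {41: 'n', 55: 'cm', 58: 'n'}
```

For a string this small the saving is negligible; it only starts to matter when the input is large enough that the intermediate list is a real cost.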

answered 13 hours ago

Georgy

I said I want solution for list/dict comprehension. Your solution is nice but looks ugly. Thanks.
– Rahul Patel, 13 hours ago

@RahulPatel: You might want to learn constructive criticism and some diplomacy ;) Georgy was nice enough to spend time on your problem...
– Eric Duminil, 4 hours ago

