Pandas: How to group by a value in column when there is list in one of the columns
I am trying to group by the values in my "value_1" column, but my last column is made up of lists. When I group by "value_1", the column made up of lists disappears.

Dataframe:

    value_1   value_2          value_3         list
    american  california, nyc  walmart, kmart  [supermarket, connivence]
    canadian  toronto          dunkinDonuts    [coffee]
    american  texas                            [state]
    canadian                   walmart         [supermarket]
    ...       ...              ...             ...

My expected output is:

    value_1   value_2                 value_3                list
    american  california, nyc, texas  walmart, kmart         [supermarket, connivence, state]
    canadian  toronto                 dunkinDonuts, walmart  [coffee, supermarket]

Thanks!
python pandas
asked 16 hours ago by johnJones901
Are all columns strings, apart from the one list column?
– jezrael 16 hours ago

Super. And what does print(df.iloc[0].apply(type)) return?
– jezrael 16 hours ago

OK, so both solutions work.
– jezrael 16 hours ago
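
The type check suggested in the comments can be sketched as follows. The frame here is a hypothetical reconstruction of the question's sample, with the blank cells assumed to be empty strings; the names first_row_types and list_columns are illustrative only.

```python
import pandas as pd

# Hypothetical reconstruction of the sample frame from the question;
# blank cells are assumed to be empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

# Python type of each cell in the first row.
first_row_types = df.iloc[0].apply(type)
print(first_row_types)

# Columns whose first cell is a list need a list-aware aggregation.
list_columns = [col for col, cell_type in first_row_types.items() if cell_type is list]
print(list_columns)
```

This only inspects the first row, so it is a quick sanity check rather than a guarantee that every cell in a column has the same type.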
2 Answers
Build the aggregation dictionary dynamically: every column except value_1 and list gets a string-joining function, and list gets a lambda that flattens the nested lists with a list comprehension:

    f1 = lambda x: ', '.join(x.dropna())
    # alternative: join only the string values
    # f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
    f2 = lambda x: [z for y in x for z in y]

    d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
    d['list'] = f2

    df = df.groupby('value_1', as_index=False).agg(d)
    print(df)

        value_1                 value_2                value_3                              list
    0  american  california, nyc, texas         walmart, kmart  [supermarket, connivence, state]
    1  canadian                 toronto  dunkinDonuts, walmart             [coffee, supermarket]

Explanation:

f1 and f2 are lambda functions.

To remove missing values (if any exist) and join the strings with a separator:

    f1 = lambda x: ', '.join(x.dropna())

To keep only string values (which omits missing values such as NaN) and join the strings with a separator:

    f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])

To keep all values except empty strings and join the strings with a separator:

    f1 = lambda x: ', '.join([y for y in x if y != ''])

Function f2 flattens the lists, because the aggregation otherwise yields nested lists like [['a','b'], ['c']]:

    f2 = lambda x: [z for y in x for z in y]
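
For reference, here is a self-contained sketch of the approach above. The sample frame is a hypothetical reconstruction from the question (blank cells assumed to be empty strings), and the names join_strings, flatten and agg_map are illustrative, not from the answer. The joining function combines the isinstance and empty-string filters so that both NaN and '' are skipped.

```python
import pandas as pd

# Hypothetical reconstruction of the question's sample frame;
# blank cells are assumed to be empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})


def join_strings(series):
    # Keep only non-empty strings, so NaN/None and '' are both skipped.
    return ", ".join(s for s in series if isinstance(s, str) and s != "")


def flatten(series):
    # Each cell is a list; concatenate the cells of one group into one list.
    return [item for sublist in series for item in sublist]


# Every column except the group key and the list column is joined as text.
agg_map = dict.fromkeys(df.columns.difference(["value_1", "list"]), join_strings)
agg_map["list"] = flatten

result = df.groupby("value_1", as_index=False).agg(agg_map)
print(result)
```

Named functions are used instead of lambdas only to make room for comments; the aggregation itself is the same dictionary-based groupby.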
answered 16 hours ago (edited 15 hours ago) by jezrael
@johnJones901 - Can you check changing f1 to f1 = lambda x: ', '.join([y for y in x if y != ''])?
– jezrael 16 hours ago

@johnJones901 - Answer was edited.
– jezrael 15 hours ago

@johnJones901 - You are welcome!
– jezrael 15 hours ago
You could group by value_1 and aggregate the string columns with the following function:

    def fun(x):
        return x.str.cat(sep=', ')

Use GroupBy.sum to concatenate the lists in the list column. Replace the empty strings with NaN first so that str.cat skips them. Passing None as the replacement value makes older pandas versions forward-fill instead, which leaks values from the previous row into the result:

    import numpy as np

    df.replace('', np.nan).groupby('value_1').agg({'list': 'sum', 'value_2': fun, 'value_3': fun})

                                          list                 value_2                value_3
    value_1
    american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
    canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart
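
A runnable sketch of this second approach, again on a hypothetical reconstruction of the question's frame (blank cells assumed to be empty strings). The names string_cols, cleaned and join_non_missing are illustrative. As an assumption made for safety, the replacement is restricted to the string columns so that the list column is never touched by replace.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's sample frame;
# blank cells are assumed to be empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})


def join_non_missing(series):
    # Series.str.cat ignores missing values when na_rep is not given.
    return series.str.cat(sep=", ")


# Turn empty strings into NaN in the string columns only.
string_cols = ["value_2", "value_3"]
cleaned = df.copy()
cleaned[string_cols] = cleaned[string_cols].replace("", np.nan)

# 'sum' on a column of lists concatenates the lists within each group.
result = cleaned.groupby("value_1").agg(
    {"list": "sum", "value_2": join_non_missing, "value_3": join_non_missing}
)
print(result)
```

Summing lists creates a new list at every step, so on large groups a flattening comprehension is likely to be faster.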
answered 16 hours ago (edited 14 hours ago) by yatu