Pandas: How to group by a value in column when there is list in one of the columns

I am trying to group by the values in my "value_1" column, but my last column is made up of lists. When I group by "value_1", the column made up of lists disappears.

Dataframe:

value_1     value_2            value_3           list
american    california, nyc    walmart, kmart    [supermarket, connivence]
canadian    toronto            dunkinDonuts      [coffee]
american    texas                                [state]
canadian                       walmart           [supermarket]
...         ...                ...               ...
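For anyone who wants to reproduce this, the frame above could be rebuilt roughly as follows. This is only a sketch: it assumes the blank cells are empty strings and that the last column holds real Python lists, neither of which the printout confirms.

```python
import pandas as pd

# Assumption: blank cells are empty strings and "list" holds real Python lists.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

# Collect the Python types stored in the last column.
element_types = set(map(type, df["list"]))
print(df)
print(element_types)
```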

My expected output is:

value_1     value_2                   value_3                  list
american    california, nyc, texas    walmart, kmart           [supermarket, connivence, state]
canadian    toronto                   dunkinDonuts, walmart    [coffee, supermarket]

Thanks!

asked 16 hours ago

johnJones901

Are they all strings, plus one list column?

– jezrael
16 hours ago

Great. What does print(df.iloc[0].apply(type)) show?

– jezrael
16 hours ago

OK, so both solutions work.

– jezrael
16 hours ago
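The type check suggested in the comments can be sketched as below. The two-column frame is hypothetical and only illustrates why the check matters: a cell that prints as [coffee] may be a real list or a string that merely looks like one, and only a real list can be flattened element by element.

```python
import pandas as pd

# Hypothetical frame: one column holds real lists, the other holds look-alike strings.
df = pd.DataFrame({
    "real_list": [["coffee"], ["state"]],
    "looks_like_list": ["[coffee]", "[state]"],
})

# Apply type() to every cell of the first row.
types = df.iloc[0].apply(type)
print(types)
```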

python pandas

2 Answers

Build the aggregation dictionary dynamically: map every column except value_1 and list to a function that joins strings, and map the list column to a lambda that flattens the nested lists with a list comprehension:

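# join the non-missing strings of a group with ', '
f1 = lambda x: ', '.join(x.dropna())
# alternative: join only the string values
# f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])

# flatten the lists of a group into a single list
f2 = lambda x: [z for y in x for z in y]

# use f1 for every column except value_1 and list, and f2 for list
d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
d['list'] = f2

df = df.groupby('value_1', as_index=False).agg(d)
print(df)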

    value_1                 value_2                value_3                              list
0  american  california, nyc, texas         walmart, kmart  [supermarket, connivence, state]
1  canadian                 toronto  dunkinDonuts, walmart             [coffee, supermarket]
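Put together, the approach might look like the self-contained sketch below. It assumes the blank cells in the question are empty strings, so the joining function filters those out; the names join_nonempty, flatten and agg_map are illustrative and not part of the original answer.

```python
import pandas as pd

# Assumption: blank cells are empty strings, as the question's printout suggests.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def join_nonempty(s):
    # Keep only non-empty strings, then join them with ', '.
    return ", ".join(y for y in s if isinstance(y, str) and y != "")

def flatten(s):
    # Turn a group's lists, e.g. [['a', 'b'], ['c']], into ['a', 'b', 'c'].
    return [z for y in s for z in y]

# Every column except the key and the list column gets the string joiner.
agg_map = dict.fromkeys(df.columns.difference(["value_1", "list"]), join_nonempty)
agg_map["list"] = flatten

out = df.groupby("value_1", as_index=False).agg(agg_map)
print(out)
```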

Explanation:

f1 and f2 are lambda functions.

This version of f1 first removes missing values, if any, and then joins the strings with the separator:

f1 = lambda x: ', '.join(x.dropna())

This alternative keeps only the string values (which also drops missing values, because NaN is a float) and joins them:

f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])

This alternative filters out empty strings before joining; it assumes the column has no NaN, since str.join accepts only strings:

f1 = lambda x: ', '.join([y for y in x if y != ''])
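To see how the variants differ, here is a small sketch on a hypothetical group that contains a real string, an empty string and a missing value. The last line combines the two filters, which is a variation on the answer rather than something taken from it.

```python
import numpy as np
import pandas as pd

# Hypothetical group: one real string, one empty string, one missing value.
s = pd.Series(["walmart, kmart", "", np.nan], dtype=object)

# Drops NaN but keeps the empty string, so a trailing separator remains.
drop_missing = ", ".join(s.dropna())

# Same result here: NaN is a float, the empty string is still a str.
only_strings = ", ".join([y for y in s if isinstance(y, str)])

# Combined filter: skip both non-strings and empty strings.
non_empty = ", ".join([y for y in s if isinstance(y, str) and y != ""])

print(repr(drop_missing), repr(only_strings), repr(non_empty))
```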

Function f2 flattens the lists, because each group otherwise holds nested lists such as [['a','b'], ['c']]:

f2 = lambda x: [z for y in x for z in y]
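As a quick illustration of the flattening step, the sketch below runs the same comprehension on a hypothetical group and compares it with itertools.chain.from_iterable from the standard library, which is an equivalent alternative and not part of the answer above.

```python
from itertools import chain

import pandas as pd

# Hypothetical group: the "list" cells of the american rows.
nested = pd.Series([["supermarket", "connivence"], ["state"]])

# The nested comprehension used by f2.
flat_comprehension = [z for y in nested for z in y]

# Standard-library equivalent.
flat_chain = list(chain.from_iterable(nested))

print(flat_comprehension)
print(flat_chain)
```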

edited 15 hours ago

answered 16 hours ago

jezrael

@johnJones901 - Can you try changing f1 to f1 = lambda x: ', '.join([y for y in x if y != ''])?

– jezrael
16 hours ago

@johnJones901 - The answer has been edited.

– jezrael
15 hours ago

@johnJones901 - You are welcome!

– jezrael
15 hours ago

You could group by value_1 and aggregate the string columns with the following function:

def fun(x):
    return x.str.cat(sep=', ')

Then use GroupBy.sum to concatenate the lists in the list column. Replace the empty strings with NaN first so that str.cat skips them; in older pandas versions, replace('', None) can fill each blank from the previous row instead of inserting a missing value:

import numpy as np

df.replace('', np.nan).groupby('value_1').agg({'list': 'sum', 'value_2': fun, 'value_3': fun})

                                      list                 value_2                value_3
value_1
american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart
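A self-contained sketch of this approach follows. It assumes the blank cells are empty strings and converts them to NaN in the two string columns only, so that str.cat leaves them out; the name join_strings and the column selection are illustrative choices.

```python
import numpy as np
import pandas as pd

# Assumption: blank cells are empty strings, as the question's printout suggests.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def join_strings(s):
    # Without `others`, Series.str.cat joins the Series itself and omits NaN.
    return s.str.cat(sep=", ")

# Turn empty strings into NaN in the string columns only.
text_cols = ["value_2", "value_3"]
df[text_cols] = df[text_cols].replace("", np.nan)

out = df.groupby("value_1").agg(
    {"list": "sum", "value_2": join_strings, "value_3": join_strings}
)
print(out)
```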

edited 14 hours ago

answered 16 hours ago

yatu
