Question:

How does .transform handle the split groups?

Solution:

.transform does not pass each split group to your function as a whole DataFrame. As the output further down shows, the function receives one column of a group at a time, and pandas also runs some internal checks to determine the type of the output. Use .transform to return something with the same shape as the input, not for its side effects (for those, use .apply, as shown later).
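To illustrate the "same shape as the input" point, here is a minimal sketch. The numeric `score` column is a hypothetical addition for illustration and is not part of the original data; `'mean'` is just one built-in aggregation that .transform can broadcast back to every row.

```python
import pandas as pd

# 'score' is a made-up column used only for this illustration
df = pd.DataFrame({'subject': ['a', 'a', 'b', 'b', 'c', 'd'],
                   'score': [1, 3, 2, 4, 5, 6]})

# One value per input row: each row gets the mean of its own group
group_mean = df.groupby('subject')['score'].transform('mean')

# Because the result is aligned with df's index, it can be used directly
df['centered'] = df['score'] - group_mean
```

The result has one entry per original row and keeps the original index, which is why it can be assigned back as a new column without any merge.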

A custom function that prints what it receives shows how .transform calls it:

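import pandas as pd

df = pd.DataFrame({'subject': ['a', 'a', 'b', 'b', 'c', 'd'],
                   'level': ['hard', None, None, 'easy', None, 'medium']})

def f(group):
    print('---')
    print(group.name)  # with `transform` this is the column name, not the group key
    print(group)
    print('===')

df.groupby('subject').transform(f)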

Output:

--- # first group

level

0 hard

1 None

Name: level, dtype: object

===

--- # internal pandas check (not a real group)

level

0 hard

1 None

===

--- # second group

level

2 None

3 easy

Name: level, dtype: object

===

--- # third group

level

4 None

Name: level, dtype: object

===

--- # fourth group

level

5 medium

Name: level, dtype: object

===

In comparison, .apply does give you the group names, which you can use for this kind of operation:

df.groupby('subject').apply(f)

---

subject level

0 a hard

1 a None

===

---

subject level

2 b None

3 b easy

===

---

subject level

4 c None

===

---

subject level

5 d medium

===
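As a small sketch of using the group name under .apply, the hypothetical helper `count_known` below builds one summary string per group. Selecting `[['level']]` before `.apply` is an assumption made here so that the function only sees the non-grouping column; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'subject': ['a', 'a', 'b', 'b', 'c', 'd'],
                   'level': ['hard', None, None, 'easy', None, 'medium']})

def count_known(group):
    # Under apply, group.name holds the group key ('a', 'b', ...)
    known = group['level'].notna().sum()
    return f"{group.name}: {known} known level(s)"

# One result per group, indexed by the group key
summary = df.groupby('subject')[['level']].apply(count_known)
```

Here the function returns one value per group, so the result is indexed by `subject` rather than by the original rows.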

Avoid using .transform to manually work on groups.

Here is another example. In .transform, group.name returns the name of the current Series (the column), not the group key; observe what happens when there are multiple columns:

import pandas as pd

df = pd.DataFrame({'subject': ['a', 'a', 'b', 'b', 'c', 'd'],
                   'level': ['hard', None, None, 'easy', None, 'medium'],
                   'level2': ['hard', None, None, 'easy', None, 'medium']
                  })

df.groupby('subject').transform(lambda g: print(g.name))

level # first group, column "level"

level2 # first group, column "level2"

a # some internal check run only once

level # second group, column "level"

level2 # second group, column "level2"

level # etc.

level2