Skip to content

datajoint/table.py: smarter dataframe conversion (#666)#776

Merged
eywalker merged 5 commits into
datajoint:masterfrom
ixcat:issue-666
May 15, 2020
Merged

datajoint/table.py: smarter dataframe conversion (#666)#776
eywalker merged 5 commits into
datajoint:masterfrom
ixcat:issue-666

Conversation

@ixcat

@ixcat ixcat commented Apr 30, 2020

Copy link
Copy Markdown

to address #666

needs further discussion/confirmation with those more familiar with pandas

@ixcat

ixcat commented May 1, 2020

Copy link
Copy Markdown
Author

@dimitri-yatsenko / @eywalker / @guzman-raphael -
Would be good to get some feedback here before un-WIP, my familiarity with pandas is a bit lacking to feel comfortable saying 'this is the fix'. See also example test script issue666.txt in #666 comments.

@neuralsignal neuralsignal left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use not isinstance(rows.index, pd.MultiIndex), since the dataframe fetched from datajoint are always multiindex and it is possible that users pass dataframes with a pd.Int64Index for example depending how they constructed the dataframe. Another option would be to test if the index names correspond to the primary keys, such as: set(self.primary_key) == set(rows.index.names)

@ixcat

ixcat commented May 4, 2020

Copy link
Copy Markdown
Author

Thanks for the feedback @gucky92
Just to make sure I've got it - these both refer to passing the 'drop' argument to reset_index, correct? e.g. you're proposing either:

  • rows.reset_index(drop=not isinstance(rows.index, pandas.MultiIndex)).to_records(index=False)
    or:
  • rows.reset_index(drop=set(self.primary_key) != set(rows.index.names)).to_records(index=False)

correct?

@neuralsignal

Copy link
Copy Markdown
Contributor

@ixcat Yes, that is what i meant.

@guzman-raphael guzman-raphael marked this pull request as draft May 7, 2020 19:54
@eywalker eywalker changed the title [WIP] datajoint/table.py: smarter dataframe conversion (#666) datajoint/table.py: smarter dataframe conversion (#666) May 15, 2020
@eywalker

Copy link
Copy Markdown
Contributor

Simpler and more in alignment with what we actually want to check is whether there is any "named" index/indices. Given this, a good check would be

rows.reset_index(drop=len(rows.index.names)==1 and not rows.index.names[0]).to_records(index=False)

The default case of an unnamed index will have rows.index.names with single entry of None and can be safely dropped.

Comment thread datajoint/table.py
@ixcat ixcat changed the title datajoint/table.py: smarter dataframe conversion (#666) [WIP] datajoint/table.py: smarter dataframe conversion (#666) May 15, 2020
@ixcat

ixcat commented May 15, 2020

Copy link
Copy Markdown
Author

back to 'WIP' until unit test

@eywalker

Copy link
Copy Markdown
Contributor

Ah sorry I removed WIP as this was set as Draft and figured that suffice to indicate the WIP status.

@ixcat ixcat changed the title [WIP] datajoint/table.py: smarter dataframe conversion (#666) datajoint/table.py: smarter dataframe conversion (#666) May 15, 2020
@ixcat ixcat marked this pull request as ready for review May 15, 2020 20:18
@guzman-raphael guzman-raphael requested a review from eywalker May 15, 2020 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants