Skip to content

Conversation

@galz10
Copy link
Collaborator

@galz10 galz10 commented Oct 4, 2022

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

BEGIN_COMMIT_OVERRIDE
chore: wrapped tables
END_COMMIT_OVERRIDE

@galz10 galz10 requested a review from a team as a code owner October 4, 2022 22:45
@product-auto-label product-auto-label bot added the size: l Pull request size is large. label Oct 4, 2022
@galz10 galz10 requested a review from dizcology October 10, 2022 22:08
@galz10 galz10 changed the title feat: add TableWrapper and helper functions feat: wrapped tables Oct 12, 2022
@galz10 galz10 requested a review from dizcology October 12, 2022 19:45
@galz10 galz10 added size: m Pull request size is medium. and removed size: l Pull request size is large. labels Oct 12, 2022
_documentai_table: documentai.Document.Page.Table = dataclasses.field(
init=True, repr=False
)
body_rows: List[List[str]] = dataclasses.field(init=True, repr=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The body rows and header rows perhaps could be cached properties and extracted only when called. Or alternatively, populated in __post_init__ (calling some of the private helper functions extracting rows from documentai.Table).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the benefit of populating it in __post_init__

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Oct 13, 2022
@galz10 galz10 requested a review from dizcology October 13, 2022 22:17
paragraphs = []
tables = []

for line in documentai_page.lines:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code for populating lines, paragraphs, and tables could go into post_init, since then the user could also instantiate a wrapped_page by just passing in a documentai_page message (useful when we support in-memory documents).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we can support both, since populating the things in post_init would require me to remove them from init. Also there would be no difference between from_documentai_page and using DocumentWrapper()

@galz10 galz10 requested a review from dizcology October 17, 2022 20:30
@galz10 galz10 merged commit 9e4e367 into main Oct 18, 2022
@galz10 galz10 deleted the wrap-table branch October 18, 2022 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants