fix: use flexible datetime parsing for start_date in file-based connectors #887
…ctors
Replace strict regex pattern with ab_datetime_try_parse validator to accept any valid ISO8601/RFC3339 datetime format. This fixes issues where valid datetime strings like '2025-01-01T00:00:00Z' (without microseconds) were incorrectly rejected.
Fixes: airbytehq/oncall#9390
Co-Authored-By: AJ Steers <aj@airbyte.io>
👋 Greetings, Airbyte Team Member!
Here are some helpful tips and reminders for your convenience.
Testing This CDK Version
You can test this version of the CDK using the following:
# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1769813348-fix-start-date-pattern#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1769813348-fix-start-date-pattern
Pull request overview
This PR replaces strict regex validation for start_date in file-based connectors with a flexible Pydantic validator using ab_datetime_try_parse. This allows valid ISO8601/RFC3339 datetime formats (like 2025-01-01T00:00:00Z) that were previously rejected due to the requirement of exactly 6 microsecond digits.
Changes:
- Added a Pydantic validator to AbstractFileBasedSpec that uses ab_datetime_try_parse for flexible datetime parsing
- Updated the pattern and pattern_descriptor fields to reflect the broader range of accepted formats
- Added test coverage for various datetime format inputs including edge cases
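To make the validator approach concrete, here is a minimal standard-library sketch of the idea: try to parse the value, and reject it only if parsing fails. The names `parse_iso8601` and `validate_start_date` are hypothetical, and `parse_iso8601` is only a stand-in for `ab_datetime_try_parse` (assumed here to return nothing on failure); the real validator in the PR is attached to the Pydantic model and may differ.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_iso8601(value: str) -> Optional[datetime]:
    """Hypothetical stand-in for ab_datetime_try_parse: returns None on failure."""
    text = value.strip()
    # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z",
    # so normalize it to an explicit UTC offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None
    # Treating naive values as UTC is an assumption made for this sketch.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def validate_start_date(value: Optional[str]) -> Optional[str]:
    """Accept None or any parseable datetime string; raise ValueError otherwise."""
    if value is None:
        return None
    if parse_iso8601(value) is None:
        raise ValueError(f"'{value}' is not a valid ISO8601/RFC3339 datetime.")
    # Return the original string unchanged so the config value is preserved.
    return value
```

In a Pydantic model, a function like `validate_start_date` would typically be registered as a field validator for `start_date`, so that an invalid value surfaces as a config validation error.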
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| airbyte_cdk/sources/file_based/config/abstract_file_based_spec.py | Added validator method and updated field metadata to support flexible datetime formats |
| unit_tests/sources/file_based/config/test_abstract_file_based_spec.py | Added comprehensive test coverage for the new validator with various datetime formats |
| unit_tests/sources/file_based/scenarios/csv_scenarios.py | Updated test expectations to match new pattern and examples |
📝 Walkthrough
This PR enhances start_date validation in file-based source configurations by introducing Pydantic validators to enforce ISO8601/RFC3339 format parsing, expanding supported date-time formats, and adding comprehensive test coverage.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
/prerelease
Aldo Gonzalez (aldogonzalez8) left a comment
APPROVED
Validation Evidence: Flexible
Merged commit efad73e into main

Summary
Replaces the strict regex pattern for start_date validation in AbstractFileBasedSpec with a Pydantic validator using ab_datetime_try_parse. This fixes issues where valid ISO8601/RFC3339 datetime strings like 2025-01-01T00:00:00Z (without microseconds) were incorrectly rejected.
Before: Only accepted YYYY-MM-DDTHH:mm:ss.SSSSSSZ (exactly 6 microsecond digits required)
After: Accepts any valid ISO8601/RFC3339 format via ab_datetime_try_parse
Fixes: airbytehq/oncall#9390
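The "Before" behavior can be illustrated with a short sketch. The regex below is an illustrative reconstruction of a pattern matching the YYYY-MM-DDTHH:mm:ss.SSSSSSZ shape described above; it is not copied from the CDK source, and the names `STRICT_PATTERN` and `accepted_by_strict_pattern` are hypothetical.

```python
import re

# Illustrative strict pattern: requires exactly six fractional-second digits
# and a literal trailing "Z". Assumed shape, not taken from the CDK.
STRICT_PATTERN = re.compile(
    r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z$"
)


def accepted_by_strict_pattern(value: str) -> bool:
    """Return True if the value matches the strict six-digit-microsecond shape."""
    return STRICT_PATTERN.match(value) is not None


samples = [
    "2025-01-01T00:00:00.000000Z",  # matches the strict shape
    "2025-01-01T00:00:00Z",         # valid RFC3339, but no microseconds
    "2025-01-01T00:00:00+00:00",    # valid RFC3339, offset instead of "Z"
]

# Map each sample to whether the strict pattern would have accepted it.
results = {sample: accepted_by_strict_pattern(sample) for sample in samples}
```

Only the first sample passes a pattern of this shape, which is the kind of rejection reported for 2025-01-01T00:00:00Z.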
Updates since last revision
- Updated csv_scenarios.py test expectations to remove pattern and pattern_descriptor fields from the expected spec, aligning with the new Field definition
Review & Testing Checklist for Human
- pattern and pattern_descriptor fields were removed from the Field definition. Verify this doesn't break UI rendering or connector spec generation for file-based sources.
- ab_datetime_try_parse is quite flexible (accepts date-only, timestamps, various offsets). Confirm this level of permissiveness is acceptable for start_date.
- … start_date formats including 2025-01-01T00:00:00Z and 2025-01-01T00:00:00.000000Z.
Notes
- … AbstractFileBasedSpec (S3, Azure Blob, SharePoint, etc.)