Some users who migrate to Airbyte from another vendor and use Snowflake as a destination may see higher Snowflake costs than before.
Why does this happen?
The primary reason is the way Airbyte is architected: we do not store any customer data on our servers, whereas other vendors do (for example: https://fivetran.com/docs/security#retentionofcustomerdata).
How is Airbyte different?
We move data by performing Typing & Deduping in the destination, whereas Fivetran does this in memory as it processes incoming data. We chose this approach because it is faster and more resilient to certain classes of errors (e.g. data that is too big, or data type mismatches). The trade-off is that this work runs on your Snowflake compute, which can increase destination costs.
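To make the term concrete, here is a minimal Python sketch of what "typing" and "deduping" mean for a batch of raw records. It is an illustration only, not Airbyte's or Fivetran's implementation: the schema, the field names (id, amount, updated_at), and the function names are all hypothetical, and it assumes the newest record per primary key should win.

```python
from datetime import datetime
from typing import Any

# Hypothetical schema for illustration: column name -> casting function.
SCHEMA = {
    "id": int,
    "amount": float,
    "updated_at": datetime.fromisoformat,
}


def type_record(raw: dict[str, Any]) -> dict[str, Any]:
    """Typing: cast each raw value to its declared type.

    A value that is missing or cannot be cast becomes None instead of
    failing the whole record.
    """
    typed = {}
    for column, cast in SCHEMA.items():
        try:
            typed[column] = cast(raw[column])
        except (KeyError, TypeError, ValueError):
            typed[column] = None
    return typed


def type_and_dedupe(raw_records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Deduping: keep only the most recent typed record per primary key."""
    latest: dict[int, dict[str, Any]] = {}
    for raw in raw_records:
        record = type_record(raw)
        key = record["id"]
        # Skip records whose key or cursor could not be typed.
        if key is None or record["updated_at"] is None:
            continue
        current = latest.get(key)
        if current is None or record["updated_at"] >= current["updated_at"]:
            latest[key] = record
    return list(latest.values())
```

Passing in two raw records with the same id returns only the one with the later updated_at. When the equivalent work is done inside the destination, it is the warehouse that spends compute on it.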
Suggestions for reducing Snowflake destination costs
If you'd prefer to fully disable Typing & Deduping in Snowflake, you can turn off the creation of final tables by enabling "Disable Final Tables" in the destination configuration for Snowflake.
If you would prefer to continue to use Typing & Deduping, we have also recently introduced a new beta option intended to reduce costs. To use this feature, toggle on "Use MERGE for De-duplication of final tables" in the destination configuration. If you do test this, feel free to add your feedback to our active GitHub discussion here.
Lastly, if neither of those suggestions fits what you are looking for, you can also check out the blog article our Engineering team published with a few suggestions on how to mitigate Typing & Deduping costs here:
https://airbyte.com/blog/cost-conscious-advanced-elt-strategies-for-data-deduplication