Co-op Change Log - February B2B Data Updates!


B2B Full Contact 3.0 & Firmographic 1.3 updates are available.

We added personal emails to the B2B Full Contact Dataset!

Available Product Schemas

  • Full B2B Contact Data 3.0 - Last Updated: 2-3-22
  • MAID/HEM Extract 2.3 - Last Updated: 1-19-22
  • IP/MAID Extract 1.3 - Last Updated: 1-19-22
  • Non-PII Contact Data 1.2 - Last Updated: 1-19-22
  • Firmographic Data 1.3 - Last Updated: 2-3-22
  • B2C Contact Data 1.0 - Target March 29th, 2022

B2B Full Contact 3.0 & Firmographic 1.3 Updates

Goal of this update: Associate personal emails + Improve Industry and Title fields

Headline: Associated 79M unique personal emails to business contacts, providing additional ways to target B2B contacts on social

Additional Updates:

  • Improved the quality of our Industry Array: We've tightened our Industry/SIC array filtering to reduce noise in the SIC and Industry fields. As we receive more contributions, we will continue to monitor the quality of these arrays.
  • Built v1 of Single Contact Resolution: Using multiple datasets in conjunction, we have attempted to identify additional ways to remove duplication in our contact records. This allows us to improve the quality of contact records and reduce the noise in our datasets.
  • Title Spelling Update MVP

Using a known high-quality list of titles, we've built a dictionary of title-specific words and word frequencies, which we then applied to titles across our database using an edit distance algorithm. This algorithm calculates the number of changes needed to turn an unknown word (token) into a known dictionary word, then updates the token to the corrected word. Words that cannot be corrected because they have no reasonable counterpart in the dictionary are not removed, but they are reflected in a lower overall confidence score for the Title. We will continue to iterate on this model with Algorithm and Dictionary updates as they are developed.
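To make the approach above concrete, here is a minimal sketch of dictionary-based title correction. It is an illustration under stated assumptions, not our production model: it uses plain Levenshtein distance, and the function names (`build_dictionary`, `correct_title`), the `max_distance` threshold of 2, the frequency tie-break, and the confidence formula (share of tokens that were known or correctable) are all hypothetical choices.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def build_dictionary(known_titles):
    """Count word frequencies across a trusted list of titles."""
    counts = Counter()
    for title in known_titles:
        counts.update(title.lower().split())
    return counts


def correct_title(title, dictionary, max_distance=2):
    """Return (corrected_title, confidence).

    Assumption: tokens are lowercased, and confidence is the fraction
    of tokens that were already known or could be corrected.
    """
    tokens = title.lower().split()
    corrected = []
    unresolved = 0
    for token in tokens:
        if token in dictionary:
            corrected.append(token)
            continue
        # Rank candidates by distance, then by higher frequency.
        candidates = [
            (edit_distance(token, word), -freq, word)
            for word, freq in dictionary.items()
        ]
        candidates = [c for c in candidates if c[0] <= max_distance]
        if candidates:
            corrected.append(min(candidates)[2])
        else:
            # Keep the token, but let it lower the confidence score.
            corrected.append(token)
            unresolved += 1
    confidence = 1.0 - unresolved / len(tokens) if tokens else 0.0
    return " ".join(corrected), confidence


# Hypothetical stand-in for the high-quality title list.
KNOWN_TITLES = [
    "Vice President of Sales",
    "Director of Marketing",
    "Senior Software Engineer",
    "Vice President of Engineering",
]
TITLE_DICTIONARY = build_dictionary(KNOWN_TITLES)
```

With this sketch, `correct_title("Vice Presidnet of Slaes", TITLE_DICTIONARY)` corrects both misspelled tokens, while a token with no close dictionary match is left in place and pulls the confidence below 1.0.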

Next 30 Days Focus

  • Digital: Completing the forking of the MAID/HEM product into an identity linkage product (all of our validated MAID/HEM pairs) and a Digital Activation product (MAID/HEM pairs seen in the last 60 days).
  • B2B Full Contact: We received a tremendous amount of email exhaust during January and will be focused on integrating it into the dataset. Additionally, we will be adding primary industry, primary SIC, and record last updated fields to the B2B schema.
  • Consumer: We’ve finalized the proposed consumer schema and are targeting a March release for this data asset. If you are interested in receiving the consumer file, please let us know. We would also love your feedback on the schema.
  • Platform: We are focusing a lot of effort on automating the standardization and normalization jobs across all of the data, which reduces the development time to deliver and frees up resources for further improvements.
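For the Digital item above, the split of MAID/HEM pairs into two products can be sketched roughly as follows. This is only an illustration: the `MaidHemPair` record, its `validated` and `last_seen` fields, and the `fork_pairs` function are hypothetical names, and the sketch assumes the Digital Activation product is drawn from the validated pairs, which may not match the final design.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class MaidHemPair:
    maid: str        # mobile advertising ID
    hem: str         # hashed email
    validated: bool  # hypothetical validation flag
    last_seen: date  # hypothetical "last observed" date


def fork_pairs(pairs, as_of, window_days=60):
    """Split pairs into (identity_linkage, digital_activation).

    identity_linkage: all validated pairs.
    digital_activation: validated pairs seen within the last
    `window_days` days of `as_of` (assumption, see lead-in).
    """
    identity_linkage = [p for p in pairs if p.validated]
    cutoff = as_of - timedelta(days=window_days)
    digital_activation = [p for p in identity_linkage if p.last_seen >= cutoff]
    return identity_linkage, digital_activation
```

Passing the delivery date as `as_of` rather than reading the system clock keeps the split reproducible when a file is regenerated later.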

Additionally, we will be reaching out to each of you to ensure your validation data is properly onboarded.

If you are subscribed to the B2B Full Contact & Firmographic data, you will receive a separate email with the bucket locations.

Please email us if you have any questions about the most recent data delivery updates, or if there are any data projects or issues you’d like to see prioritized that we have not mentioned.

We appreciate all of your feedback,

Nick Weldon, Co-founder of The Data Co-op