The Five Nine

What is data curation and why does it matter?


This week we’re going back down the AI rabbit hole, but we’re venturing down a new tunnel to talk about something called data curation. 

Though AI is still a developing technology, it's well known by now that models are only as good as the data they're trained on. But for enterprises looking to fine-tune publicly available models, it can be a challenge to make sure they're making the right data available. Why? Well, the vast majority of enterprise data is what is known as unstructured data. That includes any data that doesn't fit neatly into rows and columns: photos, videos, emails, PDFs, you name it.

Enter data curation – which is basically just the process of sorting through all this data to decide what is relevant to train the model and what’s not. Today this is mostly a tedious, manual process. But is it even worth the hassle? 

We spoke to Vincent Chen, Director of Product and Founding Engineer at Snorkel AI, to get the lowdown on how data curation works, why it matters and whether it's worth the hassle.

To learn more about the topics in this episode: 


The Five Nine

Join host Diana Goovaerts and other Fierce Network editors as they explore the stories behind some o 