Data cleaning: Overview and emerging challenges X Chu, IF Ilyas, S Krishnan, J Wang Proceedings of the 2016 international conference on management of data, 2201 …, 2016 | 669 | 2016 |
HoloClean: Holistic data repairs with probabilistic inference T Rekatsinas, X Chu, IF Ilyas, C Ré arXiv preprint arXiv:1702.00820, 2017 | 539 | 2017 |
Holistic data cleaning: Putting violations into context X Chu, IF Ilyas, P Papotti 2013 IEEE 29th International Conference on Data Engineering (ICDE), 458-469, 2013 | 448 | 2013 |
KATARA: A data cleaning system powered by knowledge bases and crowdsourcing X Chu, J Morcos, IF Ilyas, M Ouzzani, P Papotti, N Tang, Y Ye Proceedings of the 2015 ACM SIGMOD international conference on management of …, 2015 | 410 | 2015 |
Data cleaning IF Ilyas, X Chu Morgan & Claypool, 2019 | 339 | 2019 |
Discovering denial constraints X Chu, IF Ilyas, P Papotti Proceedings of the VLDB Endowment 6 (13), 1498-1509, 2013 | 316 | 2013 |
Detecting data errors: Where are we and what needs to be done? Z Abedjan, X Chu, D Deng, RC Fernandez, IF Ilyas, M Ouzzani, P Papotti, ... Proceedings of the VLDB Endowment 9 (12), 993-1004, 2016 | 305 | 2016 |
Trends in cleaning relational data: Consistency and deduplication IF Ilyas, X Chu Foundations and Trends® in Databases 5 (4), 281-393, 2015 | 194 | 2015 |
CLAMS: bringing quality to data lakes M Farid, A Roatis, IF Ilyas, HF Hoffmann, X Chu Proceedings of the 2016 International Conference on Management of Data, 2089 …, 2016 | 115 | 2016 |
CleanML: A study for evaluating the impact of data cleaning on ML classification tasks P Li, X Rao, J Blase, Y Zhang, X Chu, C Zhang 2021 IEEE 37th International Conference on Data Engineering (ICDE), 13-24, 2021 | 109 | 2021 |
ZeroER: Entity resolution using zero labeled examples R Wu, S Chaba, S Sawlani, X Chu, S Thirumuruganathan Proceedings of the 2020 ACM SIGMOD International Conference on Management of …, 2020 | 98 | 2020 |
Distributed data deduplication X Chu, IF Ilyas, P Koutris Proceedings of the VLDB Endowment 9 (11), 864-875, 2016 | 98 | 2016 |
Qualitative data cleaning X Chu, IF Ilyas Proceedings of the VLDB Endowment 9 (13), 1605-1608, 2016 | 82 | 2016 |
Transform-data-by-example (TDE): an extensible search engine for data transformations Y He, X Chu, K Ganjam, Y Zheng, V Narasayya, S Chaudhuri Proceedings of the VLDB Endowment 11 (10), 1165-1177, 2018 | 81 | 2018 |
CleanML: A benchmark for joint data cleaning and machine learning [experiments and analysis] P Li, X Rao, J Blase, Y Zhang, X Chu, C Zhang arXiv preprint arXiv:1904.09483 75, 2019 | 63 | 2019 |
TEGRA: Table extraction by global record alignment X Chu, Y He, K Chakrabarti, K Ganjam Proceedings of the 2015 ACM SIGMOD international conference on management of …, 2015 | 53 | 2015 |
Nearest neighbor classifiers over incomplete information: From certain answers to certain predictions B Karlaš, P Li, R Wu, NM Gürel, X Chu, W Wu, C Zhang arXiv preprint arXiv:2005.05117, 2020 | 51 | 2020 |
OmniFair: A declarative system for model-agnostic group fairness in machine learning H Zhang, X Chu, A Asudeh, SB Navathe Proceedings of the 2021 international conference on management of data, 2076 …, 2021 | 50 | 2021 |
KATARA: reliable data cleaning with knowledge bases and crowdsourcing X Chu, J Morcos, IF Ilyas, M Ouzzani, P Papotti, N Tang, Y Ye Proceedings of the VLDB Endowment 8 (12), 1952-1955, 2015 | 47 | 2015 |
Sema-join: joining semantically-related tables using big table corpora Y He, K Ganjam, X Chu Proceedings of the VLDB Endowment 8 (12), 1358-1369, 2015 | 47 | 2015 |