Uber’s HiveSync team optimized Hadoop DistCp to handle multi-petabyte replication across hybrid cloud and on-premises data lakes. Enhancements include task parallelization, Uber jobs for small ...
Suggested Citation: "8 Session 7: Humans and Machines Working Together with Big Data." National Academies of Sciences, Engineering, and Medicine. 2017. Challenges in Machine Generation of Analytic ...