Large enterprises never have just one or two data sources. There are always tens or hundreds of places they need to pull data from, if not thousands. What may have started out as a couple of shell scripts in the early days of a company may have turned into a crontab of tasks, which morphs into a spider's web of cron dependencies, and eventually you’ve got program management breathing down your department’s neck asking, “Why does this ETL process keep failing and costing us money?” …

Mark McCracken

