Refinement: Skipping Bad Records
Map/Reduce functions sometimes fail for particular inputs
- Best solution is to debug & fix, but not always possible
- On seg fault:
  - Send UDP packet to master from signal handler
  - Include sequence number of record being processed
- If master sees two failures for same record:
  - Next worker is told to skip the record
Effect: Can work around bugs in third-party libraries
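
A minimal sketch of the master-side bookkeeping described above, under stated assumptions: the names Master, report_failure, should_skip, and run_task are hypothetical, a Python exception stands in for the seg fault, and a direct method call stands in for the UDP packet sent from the signal handler. It only illustrates the "two failures, then skip" rule, not a real implementation.

```python
SKIP_THRESHOLD = 2  # from the slide: two failures for the same record


class Master:
    """Hypothetical master that counts failures per record sequence number."""

    def __init__(self):
        self.failures = {}  # record sequence number -> number of reported failures

    def report_failure(self, seq_no):
        # Stands in for receiving the UDP packet sent by the crashing worker.
        self.failures[seq_no] = self.failures.get(seq_no, 0) + 1

    def should_skip(self, seq_no):
        # A record is skipped once it has failed SKIP_THRESHOLD times.
        return self.failures.get(seq_no, 0) >= SKIP_THRESHOLD


def run_task(master, records, map_fn):
    """One attempt at a task. Returns (outputs, completed)."""
    outputs = []
    for seq_no, record in enumerate(records):
        if master.should_skip(seq_no):
            continue  # master told this worker to skip the bad record
        try:
            outputs.append(map_fn(record))
        except Exception:
            # Stands in for the seg fault: report the record, abandon the attempt.
            master.report_failure(seq_no)
            return outputs, False
    return outputs, True
```

With records ["1", "2", "x", "4"] and int as the map function, the first two attempts both fail on record 2; the third attempt skips it and completes with [1, 2, 4].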