Fall 2006 Seminar

 
 
       
  Machine Learning Seminar Series
 
 
  Seminar Schedule (Seminar Organizer: Prof. Ziv Bar-Joseph)
 

 


 

 

Date: November 20, 2006
Time: 3:00 PM - 4:30 PM
Location: 1305 NSH
Speaker: David Steier, Center for Advanced Research, PricewaterhouseCoopers LLP
Title: Large Scale Detection of Irregularities in Accounting Data
Abstract: In recent years, there have been several large accounting frauds in which a company's financial results were intentionally misrepresented by billions of dollars. In response, regulatory bodies have mandated that auditors perform analytics on detailed financial data with the intent of discovering such misstatements. For a large auditing firm, this may mean analyzing millions of records from thousands of clients. This paper proposes techniques for automatic analysis of company general ledgers on such a large scale, identifying irregularities, which may indicate fraud or merely honest errors, for additional review by auditors. These techniques have been implemented in a prototype system, called Sherlock, which combines aspects of both outlier detection and classification. In developing Sherlock, we faced three major challenges: developing an efficient process for obtaining data from many heterogeneous sources, training classifiers with only positive and unlabeled examples, and presenting information to auditors in an easily interpretable manner. In this paper, we describe how we addressed these challenges over the past two years and report on experiments evaluating Sherlock. After three years of working under non-disclosure, this work will be presented at the International Conference on Data Mining in December. As a proud alum (PhD, SCS '89), I am pleased to say that this preview at CMU will be the first external talk we are giving on the details of Sherlock.
Speaker Bio: David Steier received his PhD in Computer Science in 1989. His dissertation, "Automating algorithm design within a general architecture for intelligence," was supervised by Allen Newell. After graduating, he spent three more years at Carnegie Mellon as a member of the research faculty in the Engineering Design Research Center, teaching classes in software engineering and applying artificial intelligence techniques to problems ranging from scheduling the manufacture of automobile windshields to creating new interfaces for the NASA Test Director, who coordinates the space shuttle launch sequence. After CMU, David hiked the southern part of the Appalachian Trail, then moved to California to join the staff of the Price Waterhouse World Technology Centre, where he eventually became Director of R&D. He and his group worked on applications of AI to the management of semi-structured data, including SEC financial filings, newswire articles, and the firm's internal Notes databases. He left PW in 1998 to join Scient, a professional services firm that created e-businesses, where he helped with both internal knowledge management (KM) applications and KM consulting for clients. He then spent a year as Senior Director of Technology and Business Development at Kanisa, a web-based customer self-service software vendor. He is now back at PricewaterhouseCoopers, applying AI techniques to the detection of corporate fraud.