Title
Automated cleansing for spend analytics
Abstract
Developing an aggregate view of procurement spend across an enterprise from transactional data is becoming an increasingly important strategic activity. Not only does it provide a complete and accurate picture of what the enterprise is buying and from whom, it also allows the enterprise to consolidate suppliers and negotiate better prices. The importance and complexity of this exercise are further magnified by the increasing popularity of Business Transformation Outsourcing (BTO), wherein enterprises turn over non-core activities, such as indirect procurement, to third parties, who then need to develop an integrated view of spend across multiple enterprises in order to optimize procurement and maximize savings. However, creating such an integrated view requires building a homogeneous data repository from disparate (heterogeneous) data sources across various geographic and functional organizations throughout the enterprise(s). Such repositories receive transactional data from various sources such as invoices, purchase orders, and account ledgers. Consequently, the transactions are not cross-indexed, refer to the same suppliers by different names, and represent information about the same commodities in different ways. Before an aggregated spend view can be developed, this data needs to be cleansed, primarily to normalize the supplier names and to map each transaction to the appropriate commodity code. Commodity mapping, in particular, is made more difficult by the fact that it has to be done on the basis of unstructured text descriptions found in the various data sources. We describe an on-demand system that performs this cleansing automatically using techniques from information retrieval and machine learning.
Built on standard integration and application infrastructure software, this system provides enterprises with a fast, reliable, accurate, and on-demand way of cleansing transactional data and generating an integrated view of spend. The system is currently being deployed by IBM for use in its BTO practice.
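The abstract names two cleansing steps, supplier-name normalization and commodity mapping from unstructured text, without detailing the algorithms. The following is only a minimal sketch of how such steps might look, assuming a simple suffix-stripping normalizer and a TF-IDF cosine-similarity matcher as stand-ins for the paper's unspecified information retrieval and machine learning techniques. The suffix list, the commodity codes (`C100` and so on), and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical legal-form suffixes to strip; a real list would be far longer.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc", "company"}

# Hypothetical commodity taxonomy: code -> short text description.
TAXONOMY = {
    "C100": "office supplies paper pens toner",
    "C200": "computer hardware laptops monitors",
    "C300": "travel airfare hotel lodging",
}

def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def normalize_supplier(name):
    """Reduce a raw supplier string to a canonical key."""
    return " ".join(t for t in tokenize(name) if t not in LEGAL_SUFFIXES)

def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (dicts) for tokenized docs, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_commodity(description, taxonomy):
    """Return (code, score) of the taxonomy entry most similar to the description."""
    codes = list(taxonomy)
    vectors, idf = tfidf_vectors([tokenize(taxonomy[c]) for c in codes])
    # Terms unseen in the taxonomy get zero weight in the query vector.
    counts = Counter(tokenize(description))
    query = {t: tf * idf.get(t, 0.0) for t, tf in counts.items()}
    scores = [cosine(query, v) for v in vectors]
    best = max(range(len(codes)), key=scores.__getitem__)
    return codes[best], scores[best]

if __name__ == "__main__":
    for raw in ("Acme Corp.", "ACME Corporation", "Acme, Inc."):
        print(repr(raw), "->", repr(normalize_supplier(raw)))
    print(map_commodity("HP laser toner cartridge, black", TAXONOMY))
```

Under these assumptions, the three spellings of the hypothetical supplier collapse to one key, and the invoice line is assigned to the only taxonomy entry that shares a term with it. A production system would need fuzzy matching for abbreviations and misspellings, and a classifier trained on labeled transactions, neither of which this sketch attempts.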
Year
2005
DOI
10.1145/1099554.1099682
Venue
International Conference on Information and Knowledge Management
Keywords
multiple enterprise,knowledge management,transactional data,indirect procurement,cleansing activity,commodity mapping,data source,integrated view,cleansing exercise,various data source,aggregate view,information retrieval,homogeneous data repository,spend analysis,unstructured data,automated cleansing,transaction data,indexation,machine learning
Field
Spend analysis,Data mining,Computer science,Outsourcing,Unstructured data,Information repository,Procurement,Analytics,Transaction data,Purchase order
DocType
Conference
ISBN
1-59593-140-6
Citations
1
PageRank
0.35
References
5
Authors
5
Name | Order | Citations | PageRank
Moninder Singh | 1 | 381105.12
Jayant Kalagnanam | 2 | 2015151.88
Sudhir Verma | 3 | 1 | 0.69
Amit J. Shah | 4 | 1 | 0.35
Swaroop K. Chalasani | 5 | 1 | 0.35