Stream C - Technology: Simon Kravis

Presentation Title: What’s important and what’s not: data cleansing in organisational infrastructure.  
Stream: Technology
Presenter:
Name: Simon Kravis
Organisation: KAZ Group
Title: Dr
 
Short Biography
IT consultant since 2004, with KAZ since 2005, working on auditing and remediation of shared storage, automatic classification of text documents and migration of documents from file systems to document management systems. Auditing experience in a wide range of organisations.

Senior Research Scientist at CSIRO Mathematical & Information Sciences 1990-2004, working on scientific visualisation, parallel computation, knowledge management for oil and gas drilling, collaborative environments.

 

 

About Presentation (Abstract)

Data cleansing is undertaken by organisations only under duress: when loading existing documents into a document management system or when file servers run out of capacity. The task of deciding which documents are important is difficult to automate, because the huge range of applications that create or manipulate files does not compel users to provide the metadata required for this task, and automatic generation of all the metadata required for filing needs a level of automated language understanding that computers do not yet have. Even the task of deciding whether documents have been used recently is prone to disastrous error if based on native filesystem metadata. Lack of cleansing on filesystems increases the rate of storage growth and the associated management costs. Corporate risk is also increased if documents in storage are not differentiated by importance.

How can data cleansing be made a normal and inexpensive procedure instead of an occasional and very costly one? Key technologies are identification of the date of last use of files (to facilitate automatic archiving), de-duplication, and document management systems. We examine how these technologies can be deployed to maintain a healthy filesystem.
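
As an illustration of the de-duplication idea only, and not of the presenter's own tooling, the following is a minimal Python sketch that groups files with identical content under a directory. It assumes that byte-for-byte identical content, detected with a SHA-256 hash, is the definition of a duplicate; the function names file_digest and find_duplicates are hypothetical and chosen for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root that have identical content.

    Assumption: "duplicate" means byte-for-byte identical content.
    Returns a list of sorted path lists, each with at least two files.
    """
    # First pass: bucket by size, since files of different size cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling find_duplicates on a shared folder returns candidate groups for review. The sketch only reports duplicates; which copy to keep, and whether to archive or delete the rest, remains a decision for the organisation.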

 


