Amazon Announces the Improvement of ML Models to Better Identify Sensitive Data on Amazon Macie
MMS • Daniel Dominguez
Article originally posted on InfoQ.
Amazon has announced a new capability to create allow lists in Amazon Macie. Text or text patterns that customers do not want Macie to report as sensitive data can now be specified in allow lists. Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect sensitive data in AWS.
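To illustrate the idea behind an allow list, here is a minimal Python sketch that drops candidate detections matching either a literal string or a regular expression. It is not Macie's implementation or API; the names `ALLOW_WORDS`, `ALLOW_REGEX`, and `filter_allowed`, and the sample values, are hypothetical.

```python
import re

# Hypothetical allow list: literal strings and one regex that should
# never be reported as sensitive (for example, known test addresses).
ALLOW_WORDS = {"support@example.com"}
ALLOW_REGEX = re.compile(r"test-\d{4}@example\.com")


def filter_allowed(candidates):
    """Return only the candidates that are NOT covered by the allow list."""
    return [
        c for c in candidates
        if c not in ALLOW_WORDS and not ALLOW_REGEX.fullmatch(c)
    ]


# Candidate detections produced by some earlier scanning step (made up here).
detections = [
    "jane.doe@corp.example",
    "support@example.com",
    "test-0042@example.com",
]
remaining = filter_allowed(detections)
```

In this sketch only the first address would still be reported; the other two are suppressed by the literal entry and the pattern, respectively.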
According to Amazon, Macie has improved the machine learning models that its managed data identifiers use when evaluating JSON data in Amazon S3 buckets, so that they produce more precise and useful results. Extracting additional information from surrounding fields in JSON data and JSON Lines files improves the models’ accuracy even further. The enhancement also speeds up the processing of certain kinds of files, which will shorten the time needed to complete sensitive data discovery jobs.
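As a rough illustration of why surrounding fields matter, the following Python sketch scans JSON Lines text and uses field names as context to flag values that may be sensitive. This is a simple keyword heuristic written for this example, not the machine learning approach Macie uses; `SENSITIVE_KEY_HINTS`, `flag_fields`, and `scan_json_lines` are hypothetical names.

```python
import json

# Hypothetical keywords that suggest a field may hold sensitive data.
SENSITIVE_KEY_HINTS = ("ssn", "card", "password", "secret")


def flag_fields(record, path=""):
    """Yield dotted paths of fields whose key name hints at sensitive content."""
    for key, value in record.items():
        full_path = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent path as context.
            yield from flag_fields(value, full_path)
        elif any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS):
            yield full_path


def scan_json_lines(text):
    """Return (line_number, field_path) pairs for every flagged field."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if isinstance(record, dict):
            findings.extend((line_no, p) for p in flag_fields(record))
    return findings


sample = (
    '{"user": {"name": "A", "ssn": "000-00-0000"}}\n'
    '{"order_id": 7, "card_number": "4111111111111111"}'
)
findings = scan_json_lines(sample)
```

The same digit string is far more likely to be a payment card number when it sits under a key such as `card_number` than under `order_id`, which is the kind of context the article describes.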
Macie applies machine learning and pattern matching techniques to selected buckets to identify and alert on sensitive data, such as names, addresses, credit card numbers, or credential materials. Identifying sensitive data in S3 can help with compliance with regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR).
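Pattern matching of this kind can be sketched in a few lines of Python. The example below finds digit sequences that look like payment card numbers and keeps only those that pass the Luhn checksum. It is a generic illustration under simplified assumptions, not Macie's detection logic; `CARD_PATTERN`, `luhn_valid`, and `find_card_numbers` are hypothetical names.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str):
    """Return normalized card-like numbers in text that pass the Luhn check."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        candidate = re.sub(r"[ -]", "", match.group())
        if luhn_valid(candidate):
            hits.append(candidate)
    return hits


text = "Card 4111 1111 1111 1111 on file; order ref 1234 5678 9012 3456."
cards = find_card_numbers(text)
```

The checksum step filters out digit runs, such as order references, that merely have the right length, which reduces false positives compared with a regular expression alone.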
Once activated, Macie automatically compiles a complete S3 inventory at the bucket level and examines each bucket to detect public access, lack of encryption, and sharing or replication with AWS accounts outside the customer’s organization. If Macie detects sensitive data or potential issues with the security or privacy of the data, it creates detailed findings that can be reviewed and remediated as necessary.
Data analysis promises valuable insights for data scientists, business managers, and artificial intelligence algorithms. At the same time, governance and security teams must ensure that this data meets the same data protection and monitoring requirements as any other part of the enterprise.
Tools that identify such information are useful after a ransomware attack, because they help determine quickly what information may have been compromised and clarify the scope of potential security concerns and fallout. To further secure data, workloads, and applications, it is also recommended to combine AWS Security Hub and Amazon GuardDuty with Amazon Macie. Other security tools to consider are OpenSSL, Let’s Encrypt, and Ensighten.