AI, Machine Learning, and Python: How to protect your AI-data model
2024-06-06 John Poulson
At the recent Automate Show held in Chicago, IL, USA, thousands of industrial automation enthusiasts convened to learn about and discuss the latest technologies and solutions, from collaborative robots to ultra-efficient manufacturing processes and smart factories. Much of the buzz on the show floor centered on novel applications of AI and machine learning (right at the “end of the arm”) geared toward helping companies invent solutions not typically found in the realm of machine vision and robotics. Use cases for cutting-edge AI and vision technologies demonstrated how these breakthroughs are enabling factories to optimize operations and powering a new wave of smart robotics capable of performing human-like tasks.
Many of these innovations, of course, are driven by software and, more often, by data: intellectual property that must be licensed and protected from reverse engineering, tampering, and counterfeiting. Let’s talk about a concrete example of monetizing an AI-based application.
In this case, a medical device company is developing an AI-based device for early detection of Parkinson’s disease. Development requires data from both sick and healthy people. The data goes beyond blood values: it also includes findings such as the detection of alpha-synuclein in the skin and symptoms such as heartburn, indigestion, or sexual dysfunction. As you can see, a large part of the intellectual property in this case consists of knowing which data needs to be evaluated. Once the required data has been collected, it is checked for measurement errors. AI is then used to create a data model from this data, e.g., as an H5 file.
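The checking step can be illustrated with a minimal sketch. It assumes a simple range-based plausibility test; the field names and limits below are hypothetical and only stand in for whatever criteria the manufacturer actually applies.

```python
# Hypothetical plausibility ranges; real limits would come from clinical experts.
PLAUSIBLE_RANGES = {
    "age_years": (18, 110),
    "alpha_synuclein_score": (0.0, 1.0),
}


def find_measurement_errors(record: dict) -> list[str]:
    """Return the names of fields that are missing, non-numeric, or out of range."""
    problems = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if not isinstance(value, (int, float)) or not (low <= value <= high):
            problems.append(field)
    return problems


def split_clean_and_rejected(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate records that pass every check from those that need review."""
    clean, rejected = [], []
    for record in records:
        if find_measurement_errors(record):
            rejected.append(record)
        else:
            clean.append(record)
    return clean, rejected
```

Only the records in the clean list would then be passed on to model training; rejected records would be reviewed or re-measured rather than silently dropped.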
The actual device now consists of input elements, software developed in Python, for example using TensorFlow or PyTorch, and the H5 file. The doctor using the device records the patient's data and enters it into the device. The device then uses the data model to calculate a prediction for Parkinson's disease. The earlier this diagnosis is made (e.g., 10 to 20 years before onset), the more measures can be taken to positively influence the course of the disease.
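As a rough sketch of what the prediction step might look like in unprotected form, the following assumes a Keras model saved as an H5 file with a single sigmoid output. The feature names, their order, and the function names are hypothetical; the actual device software is not described in this article.

```python
# Hypothetical feature order; it must match the order used during training.
FEATURE_ORDER = ["age_years", "alpha_synuclein_score", "heartburn", "indigestion"]


def build_feature_vector(record: dict) -> list[float]:
    """Turn one patient record into a flat list of floats in training order."""
    return [float(record[name]) for name in FEATURE_ORDER]


def predict_risk(model_path: str, record: dict) -> float:
    """Load a Keras model from an H5 file and return one score for the record."""
    # Imported here so the helper above works without TensorFlow installed.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model(model_path)
    batch = np.array([build_feature_vector(record)], dtype="float32")
    # Assumption: the model has exactly one sigmoid output unit.
    return float(model.predict(batch, verbose=0)[0][0])
```

Anyone who can copy the H5 file and read this script can reproduce the prediction, which is why both the code and the data file need protection.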
The primary intellectual property worth protecting in this case is the H5 file. The medical device company's effort lies primarily in knowing which data needs to be recorded, collecting that data, and cleaning it. A copycat who obtained the H5 file, however, could produce a comparable device at a fraction of the development cost.
Wibu-Systems’ AxProtector Python prevents this illegal replication. With AxProtector Python, both the application and the data are encrypted. To ensure secure data protection, the medical device manufacturer uses a secure hardware element (CmDongle) in the form of an ASIC. The keys for the software and the data are stored in the CmDongle in a way that cannot be read out. Since the CmDongle is permanently installed in the device as an ASIC, it cannot simply be removed from the device. As an alternative to the CmDongle, a computer-bound, encrypted license file (CmActLicense) or a user-based container in the cloud (CmCloudContainer) could also be used. The CmActLicense is suitable for low-cost applications with lower security requirements. If the application is used in the cloud rather than in an offline device, the CmCloudContainer is ideal. In our case, offline use with the highest level of security is required, which is why the CmDongle is used.
AxProtector Python uses these keys to encrypt the application when the “FileEncryption” option is activated. Besides protecting the application, this option enables it to read protected, i.e., encrypted, data files. Reading takes place in a protected environment and only if the key for the data file is available. The data file itself is also encrypted with AxProtector Python.
In production, the protected application and the protected data file are installed on the device and the appropriate key is transferred to the CmDongle. The device is now functional and ready for commercialization, once the appropriate medical approvals are in place.
In our case, the CmDongle contains a permanent license, i.e., the key can be used for an unlimited period. The doctor pays once for the device. It is also conceivable to rent the device as a subscription or charge a pay-per-use fee, i.e., a fee per prediction. CodeMeter offers these options with CmDongles as well as CmActLicenses and CmCloudContainers. Updates can also be monetized. In this case, a new data model is encrypted with a new key, i.e., a different license. Physicians who purchased this new license would then be able to use the updated data model.
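The three license models and the update scenario can be pictured with a small, purely conceptual sketch. It is not the CodeMeter API: in a real deployment these checks run inside the license container, and the class, field, and function names below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class License:
    product_code: int                  # hypothetical ID tying a license to one data model
    expires_on: Optional[date] = None  # None means a permanent license
    units_left: Optional[int] = None   # None means no pay-per-use counter


def may_predict(lic: License, model_product_code: int, today: date) -> bool:
    """Decide whether one prediction is allowed, consuming a unit if metered."""
    if lic.product_code != model_product_code:
        return False  # e.g., an updated model that needs a different license
    if lic.expires_on is not None and today > lic.expires_on:
        return False  # subscription has run out
    if lic.units_left is not None:
        if lic.units_left <= 0:
            return False  # pay-per-use budget is used up
        lic.units_left -= 1
    return True
```

A permanent license leaves both optional fields unset, a subscription sets an expiry date, and a pay-per-use license sets a unit counter that is decremented with each prediction.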
If you are developing AI-based applications with Python, I encourage you to watch our recorded Webinar, Protecting Python Applications the Simpler Way, and learn how to protect the know-how that you have built into your Python applications and your data from reverse engineering and how you can use CodeMeter to monetize this investment into your products.
Contributor
John Poulson
Sr. Account Manager
A senior manager and well-respected security industry expert, John has worked in business development and sales for Wibu-Systems USA since 2001. When not consulting with customers on software licensing and protection solutions, John attends industry trade shows and conferences to stay abreast of the latest developments in the IT world. Prior to Wibu-Systems, John worked for Micro Security Systems, Eagle Data, and Griffin Technologies, all pioneers in software security.
Over the years, John has authored several blog articles on topics of general interest in cryptography as well as monetization of embedded systems in new and innovative ways.