01 April 2011

Aggregation

Posts relating to the category tag "aggregation" are listed below.

01 April 2011

ICO Data Anonymisation Seminar

Earlier this month I discussed a seminar being organised by the Information Commissioner's Office (ICO). I was fortunate enough to be able to attend the event on Wednesday at The Wellcome Trust on Euston Road in London.

Photograph of the Information Commissioner Christopher Graham introducing the ICO Data Anonymisation Seminar in London, UK

The event began with a welcome from Christopher Graham (Information Commissioner, ICO). He explained the seminar was not a theoretical debate about legal definitions, but instead a discussion of the current and emerging practical risks of re-identification. In particular he hoped ideas would form on how best to assess and mitigate the privacy risks of some form of statistic leading to someone being identified.

Sir Mark Walport (Director, The Wellcome Trust) continued on this theme but focused on the medical research sector. He explained that having good data is inextricably linked to good public health. He outlined various benefits of sharing data to individuals and the public, and identified proportionality, choice of terms of service and confidentiality vs. consent as the key issues. He also touched on some of the content in the Data Sharing Review Report, written in conjunction with the previous Information Commissioner Richard Thomas.

Photograph of Sir Mark Walport, Director of the Wellcome Trust, speaking at the ICO Data Anonymisation Seminar in London, UK

Dr Mark Elliot (University of Manchester) discussed anonymisation as disclosure avoidance and the need for formal disclosure risk assessments. These can include undertaking simulated data intrusions to help rank file riskiness, in a similar way to how an organisation might rank processes or applications by other forms of operational risk. He explained that such assessments need to consider the intruder's motivations and the consequences of disclosure (to individuals, organisations and society), and also need to take into account the issue of spontaneous recognition.

Following a short break, Nicola Westmore (Cabinet Office) outlined the government's transparency agenda, which aims to promote efficiency & effectiveness, improve public services and allow citizens to make an informed choice. She talked about the privacy risks inherent in data.gov.uk and the drivers for government data disclosure.

Dr Kieron O'Hara (University of Southampton) asked whether transparency will pose a threat to privacy, especially in the areas of crime data and demand-driven transparency, which he believes will be strongest in the areas of health, education and court data. He said that privacy is not only a legal matter: it is not just data protection, as this is insufficient to retain trust, the law has grey areas, and citizens' perceptions do not follow the content of the Data Protection Act. He felt the law was not the answer and a discussion was needed between transparency activists, privacy activists, technical experts, domain experts and information entrepreneurs. He would also like to see auditable debate trails by organisations making decisions as to whether and what data are released.

Dr Marie Cruddas (Office for National Statistics) talked about the balance between data utility and risk. She walked through the confidentiality protection framework, used to determine how data are released by the ONS. This considers the end-user requirements, data quality, sensitivity, age, coverage and other characteristics, a disclosure risk assessment, disclosure controls (legal, ethical and practical), management of disclosure risk and implementation. An interesting idea was the concept of undertaking a penetration test on data sets, to see how they can be re-identified alone, or together with other data sets.
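One simple ingredient of such a test is to count how many records share each combination of indirectly identifying values. The Python sketch below is only an illustration of that idea (a k-anonymity style count), not the ONS framework; the field names and records are made up, and a real assessment would also try linking against other data sets.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group (the "k" in k-anonymity), or 0 if empty."""
    return min(group_sizes(records, quasi_identifiers).values(), default=0)


def unique_records(records, quasi_identifiers):
    """Return the records that are alone in their group, so easiest to single out."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        record for record in records
        if sizes[tuple(record[field] for field in quasi_identifiers)] == 1
    ]


# Hypothetical release with names and addresses already removed.
release = [
    {"postcode_area": "EC1", "birth_year": 1970, "sex": "F", "diagnosis": "A"},
    {"postcode_area": "EC1", "birth_year": 1970, "sex": "F", "diagnosis": "B"},
    {"postcode_area": "EC1", "birth_year": 1982, "sex": "M", "diagnosis": "C"},
    {"postcode_area": "N1", "birth_year": 1970, "sex": "F", "diagnosis": "A"},
]
fields = ["postcode_area", "birth_year", "sex"]

print(smallest_group_size(release, fields))
print(len(unique_records(release, fields)))
```

A smallest group size of 1 means at least one person is unique on those fields alone, which is the sort of record an intruder with a second data set would look for first.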

Photograph of delegates gathering again after lunch at the ICO Data Anonymisation Seminar in London, UK

Once delegates had re-assembled from the lunch break, Paul Ohm (University of Colorado) described how there is a perception that anonymisation is ubiquitous, trusted and rewarded by law in terms of benefits and exemptions. He described how even relatively innocuous data can be used to identify individuals and discussed how policy makers should respond. He believes lists of personally identifiable information (PII) are unsustainable and that technology will not be a solution, partly due to the accretion problem where we creep closer and closer to personal data releases. He believes in the use of contextual risk assessments, best effort approaches, consideration of risks, motives & criminal behaviour, accountability measures and a reduction in the unjustifiably risky collection of information. I can see how threat modelling can be extended into this area further.

Barry Ryan (Market Research Society) provided a background to the MRS' principles, from classical research to how this has changed through the use of non-anonymous participation, qualitative groups, online market research communities, and ethnographic and deliberative techniques. Research clients often provide individual contacts, and they are demanding more information which is more detailed.

Photograph of the facilitated discussion panel members at the ICO Data Anonymisation Seminar in London, UK - the speakers already sat down are left-to-right Kieron O'Hara, Paul Ohm, Barry Ryan, Nicola Westmore, and David Smith - they were joined shortly by Marie Cruddas and Mark Elliot

David Smith (Deputy Commissioner and Director of Data Protection, ICO) chaired the panel discussion where the speakers discussed whether access controls are useful, the rights of individuals to compensation and redress, audit trails for data downloads, the usefulness of a register of data controllers, anonymisation as a failed concept, the influence of China on the internet with its focus on traceability, the need for trust, effort needed in the education system and, inevitably, the need for further research.

David Smith thanked all the speakers and provided an engaging summary of the seminar. Since he considered the outcome was that true anonymisation is not possible, this made summing up more difficult. The ICO will develop and issue a report on the day, together with the presenters' slides, and David Smith asked that any further contributions be forwarded to the ICO.

Photograph of Paul Ohm and Mark Elliot talking after the close of the ICO seminar on Data Anonymisation at the Wellcome Trust, Euston Road in London on Wednesday 30th March 2011

My own conclusions? The situation is complicated, and there isn't yet agreement on the best path forward. Anonymisation is a partial privacy protection method, but data can almost always be re-identified and therefore it cannot be relied upon as a definitive protective measure, or as an excuse/exemption from data protection requirements. It seems there may be a move towards risk assessments rather than specified conditions and controls.

But do read Paul Ohm's paper Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization which I highlighted in a previous post about test data. He also provided the best quotation of the day: "Data can be either useful or perfectly anonymous but never both".

Update 5th August 2011: A report of the proceedings is now available.

Posted on: 01 April 2011 at 08:55 hrs

02 July 2010

Web Site Security Basics for SMEs

Sometimes when I'm out socially and people ask what I do, the conversation progresses to concerns about their own web site. They may have a hobby site, run a micro-business or be a manager or director of a small and medium-sized enterprise (SME); there are all sorts of great entrepreneurial activity going on.

It is very common for SMEs not to have much time or budget for information security, and the available information can be poor or inappropriate (ISSA-UK, under the guidance of their Director of Research David Lacey, is trying to improve this). But what can SMEs do about their web presence? After all, it is very unusual not to have a web site, whatever the size of business.

Photograph of a waste skip at the side of St John Street in Clerkenwell, London, UK, with the company's website address written boldly across it

Last week I was asked "Is using <company> okay for taking online payments?" and then "what else should I be doing?". Remember we are discussing protection of the SME's own web site, not protecting its employees from using other sites. If I had no information about the business or any existing web security issues, this is what I recommend checking and doing before anything else:

  • Obtain regular backup copies of all data that changes (e.g. databases, logs, uploaded files) and store these securely somewhere other than the host servers. This may typically be daily, but the frequency should be selected based on how often data changes and how much data the SME might be prepared to lose in the event of total server failure.
    • periodically check that backup data can be read and restored
    • don't forget to securely delete data from old backups when they are no longer required
  • Use a network firewall in front of the web site to limit public (unauthenticated user) access to those ports necessary to access the web site. If other services are required remotely, use the firewall to limit from where (e.g. IP addresses) these can be used.
    • keep an up-to-date record of the firewall configuration
    • limit who can make changes to the firewall
  • Ensure the host servers are fully patched (e.g. operating system, services, applications and supporting code), check all providers for software updates regularly and allow time for installing these.
    • remove or disable all unnecessary services and other software
    • delete old, unused and backup files from the host servers
  • Identify all accounts (log in credentials) that provide server access (not just normal web page access), such as those used for transferring files, accessing administrative interfaces (e.g. CMS admin, database and server management/configuration control panels) and using remote desktop. Change the passwords. Keep a record of who has access, remove accounts that are no longer required, and enable logging for all access using these accounts.
    • restrict what each account can do as much as possible
    • add restrictions to the use of these accounts (e.g. limit access by IP address, require written approval for use, keep account disabled by default)
  • Check that every agreement with a third party that is required to operate the web site is in the organisation's own name. These may include the registration of domain names, SSL certificates, hosting contracts, monitoring services, data feeds, affiliate marketing agreements and service providers such as for address look-up, credit checks and making online payments.
    • ensure the third parties have the organisation's official contact details, and not those of an employee or of the site's developers
    • make note of any renewal dates
  • Obtain a copy of everything required for the web site including scripts, static files, configuration settings, source code, account details and encryption keys. Keep this updated with changes as they are made.
    • verify who legally owns the source code, designs, database, photographs, etc.
    • check what other licences affect the web site (e.g. use of open source and proprietary software libraries, database use limitations).
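As a rough illustration of the first item (checking that backups can actually be read and restored), here is a minimal Python sketch that compares SHA-256 checksums of the live files with those of a test restore. The directory layout and file names are hypothetical, and this is only one way to do such a check; it says nothing about whether a restored database is usable by the application.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original, restored):
    """Return (missing, changed) file names found when checking a restore."""
    missing = sorted(set(original) - set(restored))
    changed = sorted(
        name for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, changed


# Demonstration with throwaway directories standing in for the live site
# and a test restore from backup (both hypothetical).
with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as restore:
    Path(live, "site.db").write_bytes(b"orders")
    Path(live, "upload.txt").write_bytes(b"hello")
    Path(restore, "site.db").write_bytes(b"orders (corrupted)")
    missing, changed = compare_manifests(build_manifest(live), build_manifest(restore))

print("missing:", missing)
print("changed:", changed)
```

Files that change between the backup and the check will show up as "changed", so a comparison like this is best run against a manifest recorded at backup time.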

Do what you can, when you can. Once those are done, then:

  • Verify the web site and all its components (e.g. web widgets and other third party code/content) do not include common web application vulnerabilities that can be exploited by attackers (e.g. SQL injection, cross-site scripting).
  • Check what obligations the organisation is under to protect business and other people's data such as the Data Protection Act, guidance from regulators, trade organisation rules, agreements with customers and other contracts (e.g. PCI DSS via the acquiring bank).
    • impose security standards and obligations on suppliers and partner organisations
    • keep an eye open for changes to business processes that affect data
  • Document (even just some short notes) the steps to rebuild the web site somewhere else, and to transfer all the data and business processes to the new site.
    • include configuration details and information about third-party services required
    • think about what else will need to be done if the web site is unavailable (does it matter, if so what exactly is important?)
  • Provide information to the web site's users on how to help protect themselves and their data.
    • point them to relevant help such as from GetSafeOnline, CardWatch and Think U Know
    • provide easy methods for them to contact the organisation if they think there is a security or privacy problem
  • Monitor web site usage behaviour (e.g. click-through rate, session duration, shopping cart abandonment rate, conversion rate), performance (e.g. uptime, response times) and reputation (e.g. malware, phishing, suspicious applications, malicious links) to gather trend data and identify unusual activity.
    • web server logs are a start, but customised logging is better
    • use reputable online tools (some of which are free) to help.
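To make the idea of spotting unusual activity in trend data a little more concrete, here is a minimal Python sketch that flags days whose value is far from the recent average (a rolling z-score). The figures, the seven-day window and the threshold of three standard deviations are all assumptions for illustration; real traffic has weekly and seasonal patterns that this ignores.

```python
from statistics import mean, stdev


def unusual_days(daily_values, window=7, threshold=3.0):
    """Return indexes of values far from the mean of the preceding `window` values."""
    flagged = []
    for index in range(window, len(daily_values)):
        history = daily_values[index - window:index]
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat history: any change at all is worth a look.
            if daily_values[index] != history[0]:
                flagged.append(index)
            continue
        if abs(daily_values[index] - mean(history)) / spread > threshold:
            flagged.append(index)
    return flagged


# Hypothetical daily counts of failed log in attempts taken from web server logs.
failed_logins = [4, 5, 3, 6, 4, 5, 4, 5, 60, 4]
print(unusual_days(failed_logins))
```

The same function could be pointed at response times or shopping cart abandonment rates; the important part is having the historical data to compare against.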

That's just the basics. So, what would be next for an SME? If the web site is a significant sales/engagement channel, the organisation has multiple web sites, is in a more regulated sector or one that is targeted particularly by criminals (e.g. gaming, betting and financial), takes payments or does other electronic commerce, allows users to add their own content or processes data for someone else, the above is just the start. Those SMEs probably need to be more proactive.

This helps to protect the SME's business information, but also helps to protect the web site users and their information. After all, the users are existing and potential customers, clients and citizens.

Oh, the best response I had from someone when I was explaining my work: "You're an anti-hacker then?". Well, I suppose so, but it's not quite how I'd describe it.

Any comments or suggestions?

Posted on: 02 July 2010 at 08:18 hrs

26 February 2010

Identifiability and Traceability Online

Last month I described the ability to track user sessions with browser data. A recent posting on IT Law in Ireland highlighted a series of blog posts elsewhere that give further insight into what is possible.

Photograph of the exhibit 'L-E-D-LED-L-ED' by Dilight at the London Design Museum, consisting of hundreds of bead-shaped light emitting diodes (LEDs) that can slide back and forth along a series of horizontal wires

Well, I just got round to reading them properly. The posts on Freedom to Tinker by David Robinson and Harlan Yu are:

The conclusion? It is possible to trace and identify individuals more easily than you may think. We are dropping evidence like dead skin cells as we traverse the internet. Fact or fiction? Well, the US Defense Advanced Research Projects Agency (DARPA) is taking it seriously, with a recent call for research into cyber genetics, cyber anthropology and cyber physiology in its Cyber Genome Program. DARPA hopes to develop advanced methods to fingerprint or identify the origins of cyber attacks by examining digital artifacts, and presumably other criminal activities utilising computer technology.

Getting a bit more down to earth, web site owners need to consider what information is being gathered and why, ensure this is legal, check that consent is implied or has been explicitly given for the purposes, and consider what monitoring and analysis is performed on the data. It could be easy for system developers to get carried away with tracking and tagging. Contracts with third parties should state clearly what the expectations are about the security and privacy of information, to protect web site users (employees, customers, clients, citizens) and the business.

Posted on: 26 February 2010 at 09:06 hrs

29 December 2009

Adverts and Privacy Notices

The Interactive Advertising Bureau (IAB) and American Association of Advertising Agencies (4A's) have published a draft revised Standard Terms and Conditions for Interactive Advertising. Whilst this is principally aimed at the USA market, due to the international nature of the Internet, I thought it worth a mention here.

Photograph of a shop's SALE banner beside various London souvenirs and other gifts

Use of the template (full title "Standard Terms and Conditions for Interactive Advertising for Media Buys One Year or Less") is voluntary and open to negotiation between media companies and advertisers. However it does discuss data usage and privacy. This is important if you have advertising on your own web site and need to write a privacy notice. Without knowing the agreement between the advertiser and media company, how can you inform your web site users what will happen to their personal information? Although this is only an example template, it probably contains most of the likely issues you will come across in other ones. The definitions of "user volunteered data", "performance data", "site data" and "use of collected data" probably need careful reading and advice from a lawyer! The education version provides some further explanation of terminology and the changes since the previous version.

The template also describes the "special situation of User-Generated Content (UGC) pages" on advert placement and positioning—there could be an interesting discussion if the actual content was neither that intended by the site owner, nor that added by the user, but instead was the result of some malicious injection.

There doesn't seem to be any reference to malware on the site or malware delivered by the advert.

Of course, including third party content is a risk that should be considered in itself.

Posted on: 29 December 2009 at 10:28 hrs

11 December 2009

Consultation on the Personal Information Online Code of Practice

On Wednesday I attended the Information Commissioner's Office (ICO) Personal Information Online Conference 2009 at which the ICO launched their consultation on the new Personal Information Online Code of Practice.

Photograph of an old office block and new apartment block in the heart of Manchester, near to the conference venue, the Lowry Hotel

Manchester and Salford gave us a beautiful sunny day for the event, which briefed delegates on the ICO's approach to data protection and outlined the collaborative process used to develop the draft code of practice. Iain Bourne, Head of Data Protection projects, noted that fewer public sector organisations than hoped had been involved to date, and that the ICO would like more feedback from this sector in particular during the consultation phase that ends on 5 March 2010.

Photograph of David Smith, Deputy Information Commissioner, giving the Personal Information Online Conference 2009 keynote address at the Lowry Hotel, Manchester

My first impressions are this will be a useful document for organisations without staff dedicated to data protection or compliance, especially once the examples and SME checklist are added. The structure and content are still a little raw, but probably about right for the start of a 12-week consultation process. Areas where I am already considering providing feedback are:

  • local storage of personal information (not just cookies)
  • verification of protection
  • suppliers, sub-contractors and staff
  • monitoring and anomaly detection
  • transmission of personal information
  • inclusion of third party content in web sites
  • using cookies to enforce an opt out
  • additional reference materials.

The full text and consultation document are available as a PDF.

Feedback on the Personal Information Online Code of Practice can be provided using the ICO's consultation portal with further background available in the related press release.

Posted on: 11 December 2009 at 10:56 hrs

20 November 2009

Layered Communications and the Web Site Concentrator

Examples of content aggregation often refer to the use of web services and XML data such as RSS feeds. But today's world of web 2.0 is creating more and more data in a wide variety of formats including JSON (JavaScript Object Notation), and web applications are being used as a concentrator to combine these together.

With the growth of layered communications, multiple communication channels such as text, video and audio are merged into one event. If the content is recorded it can be republished via a web site. But what are the specific security risks of this?

Web services and XML data can include invalid or malicious data. The format/schema may be incorrect. But with the increase in layered communications, content from many different devices in many media may need to be aggregated into a single resource; and these often don't have any formal syntactical structure. The data might even include active content such as embedded rich applications.

Diagram showing six data feeds (voice, text, photograph, application, video and ?/other) contributing to the output from a web application

If such content needs to be stored and replayed at a later date, how might it affect a web page? The content could contain, or link to, malicious content that steals user data such as session cookies, modifies the page's content or installs malware onto users' computers. Some steps to reduce the risk:

  • Identify all the data streams.
  • Determine their formats and encoding where appropriate.
  • Ruthlessly limit what active (script) content is allowed and what ability it has to interact with the parent web site and its domain.
  • Analyse the data streams to validate they contain what is intended and scan for malware.
  • Sanitise content where applicable.
  • Limit file size/length/number of nodes.
  • Avoid merging trusted and untrusted content in data fields.
  • Encode the output correctly for your own application.
  • Monitor activity and look out for unusual events.
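A few of these steps (limiting size and number of nodes, validating the format, and encoding the output) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not a complete defence: the limits, the function names and the expectation that each feed item is a JSON object with a string "text" field are all hypothetical, and the escaping shown is only suitable for output placed in the body of an HTML element.

```python
import html
import json

MAX_BYTES = 10_000  # assumed upper limit on the size of one feed item
MAX_NODES = 200     # assumed upper limit on the number of JSON values


def count_nodes(value):
    """Count every value in a parsed JSON structure without recursion."""
    total, stack = 0, [value]
    while stack:
        current = stack.pop()
        total += 1
        if isinstance(current, dict):
            stack.extend(current.values())
        elif isinstance(current, list):
            stack.extend(current)
    return total


def accept_feed_item(raw_bytes):
    """Validate one untrusted JSON feed item and return HTML-escaped text.

    Raises ValueError if the item breaks any of the assumed rules.
    """
    if len(raw_bytes) > MAX_BYTES:
        raise ValueError("feed item too large")
    try:
        item = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as error:
        raise ValueError("feed item is not valid UTF-8 JSON") from error
    if count_nodes(item) > MAX_NODES:
        raise ValueError("feed item has too many nodes")
    if not isinstance(item, dict) or not isinstance(item.get("text"), str):
        raise ValueError("feed item must be an object with a string 'text' field")
    # Encode for the HTML element body context before merging into a page.
    return html.escape(item["text"], quote=True)


print(accept_feed_item(b'{"text": "<script>alert(1)</script>"}'))
```

Rejecting anything that does not match the expected shape, rather than trying to repair it, keeps untrusted content from being merged into the page in a form the application did not anticipate.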

And beware embedding rich internet applications (RIAs) such as Adobe Flash or Microsoft Silverlight, which may be doing this aggregation themselves.

After all, you don't want your web site to be a concentrator multiplexing malware.

Posted on: 20 November 2009 at 12:20 hrs

Aggregation : Web Security, Usability and Design
http://www.clerkendweller.com/aggregation

Please read our terms of use and obtain professional advice before undertaking any actions based on the opinions, suggestions and generic guidance presented here. Your organisation's situation will be unique and all practices and controls need to be assessed with consideration of your own business context.

© 2008-2012 clerkendweller.com