Are your Google Docs leaking?
Just the other day I read an article on InfoWorld claiming that “Google Docs is ‘widely used’ at 1 in 5 workplaces”.
At almost the same time, our Google Apps admin got an email from Google titled “Google Apps: Important changes coming to published Google Docs”. It asked that a certain option be checked; otherwise, all the docs shared outside our domain would be indexed and available on the World Wide Web.
20% is a big number. I knew that there are companies out there that use Google Apps, and specifically Google Docs, to run their business (we at Aprigo are doing exactly that). This raises a very important topic: having a ‘holistic view’, or unified way to manage data, whether it sits locally or in the cloud.
Some problems show up when your data is on-premise but much less so when it’s in the cloud. A good example is the cost of storing that data: even though storage prices drop all the time, local storage still costs more than what Google charges you for that space, which is $0. Other problems exist for both your on-premise data and your cloud data. An example of that is ‘Access Management’, or being able to answer questions like:
- Who has access to a specific data set?
- Who shouldn’t have access to a data set?
- What’s accessible to a specific user or group?
- Who’s doing what with the data?
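The four questions above can be sketched as simple queries over a permissions snapshot. This is only an illustration: the data layout, the names `GRANTS`, `APPROVED` and `AUDIT_LOG`, and the sample users and paths are all assumptions made up for the example, not the API of any real file server or of Google Apps.

```python
# Hypothetical snapshot of permissions. In practice this would be collected
# from file-server ACLs or from the cloud provider's admin tooling.
GRANTS = {  # resource -> principals who currently have access
    "finance/budget.xls": {"alice", "bob", "contractor@example.com"},
    "eng/design.doc": {"carol", "bob"},
}
APPROVED = {  # resource -> principals who are supposed to have access
    "finance/budget.xls": {"alice", "bob"},
    "eng/design.doc": {"carol", "bob"},
}
AUDIT_LOG = [  # (principal, action, resource), oldest first
    ("alice", "edit", "finance/budget.xls"),
    ("contractor@example.com", "view", "finance/budget.xls"),
    ("carol", "edit", "eng/design.doc"),
]


def who_has_access(resource):
    """Who has access to a specific data set?"""
    return set(GRANTS.get(resource, ()))


def who_should_not_have_access(resource):
    """Who has access without being on the approved list?"""
    return who_has_access(resource) - APPROVED.get(resource, set())


def accessible_to(principal):
    """What is accessible to a specific user or group?"""
    return {res for res, users in GRANTS.items() if principal in users}


def activity_on(resource):
    """Who is doing what with the data?"""
    return [(who, action) for who, action, res in AUDIT_LOG if res == resource]
```

With this sample data, `who_should_not_have_access("finance/budget.xls")` flags the outside contractor, who has access but is not on the approved list.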
I find that most companies take a hybrid approach. If you’re a relatively young company, you might have started with everything in the cloud but gradually added local applications and infrastructure over time. For example, you might set the company up with:
- Google Apps – for email, calendar, documents etc.
- AWS – as your virtual infrastructure
- An online bug-tracking system, expense system etc.
But then a few things happen:
- At some point the engineering team requests a local file server so they can share files that are either unsupported on Google Docs or too large to push across the WAN just so that people sitting next to each other can access them
- At yet another time, the QA team wants to deploy a few local VMs so they can do some performance testing that’s not possible to do on AWS
- Users start keeping local copies of their Google Docs
- Other users insist on using Microsoft Office or Apple iWork because “There are some things I just can’t do on Google spreadsheets”
The end result? You end up with a ‘Hybrid Infrastructure’: some of the infrastructure is local and some of it is in the cloud, and the same goes for the data.
If you’re a more established company, you probably started with 100% of your infrastructure on-premise, but then:
- You launch an IT project that spawns a few VMs on Amazon Web Services
- A few of your users start using Google Docs to share data more easily with co-workers and even external suppliers, customers etc.
- The sales team is standardizing on Salesforce.com
In this case, most likely 80% of your data is on-premise and only 20% is in the cloud, but you can see the trend, and even that 20% needs to be managed.
When a company has its data fragmented between on-premise and the cloud, the IT management applications it uses (specifically Data Management Applications) need to support that fragmented environment.
It’s one thing to have different classes of IT management applications for on-premise and cloud environments. It’s better if one application covers both: if I care about ‘Access Management’, then ideally the same app also provides a ‘Cloud Access Management’ capability. Better still would be for those capabilities to provide a holistic view of my data regardless of where it sits. An example use case where a holistic view would be helpful:
- Using my access management application, I’m able to lock down the finance folder on my file server.
- A ‘finance user’ uploads a document from the finance folder to Google Docs and shares it with internal & external users.
- My ‘Cloud Access Management’ application is able to alert me that there are different access permissions for the same document in those two different locations and that a potential data leak exists.
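A minimal sketch of the check in that last step, under loud assumptions: copies of the “same document” are matched by a SHA-256 hash of their bytes, and each copy carries a plain set of principals who can read it. The `Copy` type, the location labels, and the sample paths and users are hypothetical. A byte-level hash only matches identical copies, so a file converted into another format on upload would need a different matching strategy.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str       # illustrative label, e.g. "fileserver" or "google-docs"
    path: str           # path or title of this copy
    content: bytes      # raw bytes of the document
    allowed: frozenset  # principals who can read this copy


def fingerprint(content: bytes) -> str:
    """Identify a document by the SHA-256 hash of its bytes."""
    return hashlib.sha256(content).hexdigest()


def find_permission_mismatches(on_prem, cloud):
    """Return (on-prem path, cloud path, extra principals) for every cloud
    copy that is readable by someone who cannot read the on-prem original."""
    originals = {fingerprint(c.content): c for c in on_prem}
    alerts = []
    for copy in cloud:
        original = originals.get(fingerprint(copy.content))
        if original is None:
            continue  # no matching on-prem document, nothing to compare
        extra = copy.allowed - original.allowed
        if extra:
            alerts.append((original.path, copy.path, sorted(extra)))
    return alerts


on_prem = [
    Copy("fileserver", "/finance/q3.xls", b"q3 numbers",
         frozenset({"alice", "bob"})),
]
cloud = [
    Copy("google-docs", "Q3 numbers", b"q3 numbers",
         frozenset({"alice", "supplier@example.com"})),
    Copy("google-docs", "Lunch menu", b"soup",
         frozenset({"everyone"})),
]
alerts = find_permission_mismatches(on_prem, cloud)
```

Here `alerts` contains one entry for the finance spreadsheet, listing the external supplier as the principal who can read the cloud copy but not the locked-down original. The unrelated “Lunch menu” document is ignored because it has no on-premise counterpart.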
The cloud storage trend is not going away. This isn’t just a fad. And if 20% of workplaces are currently using things like Google Docs, think of how quickly that number is going to rise. The time for IT management apps that can understand access, permissions, rights and security both on-premise and in the cloud is now.
CloudLock For Google Apps helps Google Apps administrators secure access to their Google Docs and Sites. 7-day free trials are available on the Google Apps Marketplace.
8 comments
Good conclusion about the Hybrid model and how it develops organically from a 100% on-premise model.
As a professional paranoid, I believe that if something is known, it can be managed/secured. The risk is greatest when users, unknown to the company or the people responsible for protecting the company’s revenue and reputation, do something like upload a spreadsheet so they can work from home, etc.
Javed,
As a professional paranoid, you must be aware that users will end up doing just that (i.e., uploading those documents to Google Docs). Why? Because it makes their job easy, which means it makes their lives easy, and because they can…
Tsahy – I completely agree with this view of the world – seems realistic to me – all companies will become hybrid, whether they plan for it or not.
The data leaking problem (Data Leakage 1.0) has been around ever since users could print a document or write it to a floppy (data was leaving the premises untracked). The data leaking problem shifted somewhat once employees could email documents to and from their personal email accounts; we might notice data leaving the premises (as it passes through a company proxy server) on its way to living on systems where you have no control – hotmail, yahoo, google, ISP, etc. – and is possibly intercepted/viewed by unknown others as it passes through cyberspace unprotected – Data Leakage 2.0.
I guess we are now on Data Leakage 3.0 – users can outright manage their documents collaboratively online – these are not necessarily copies of the “real” ones, but rather the de facto system of record – all out in the open as on Google Docs – and we all need to hope we aren’t exposed, leaking, indexed, hacked via flaws in The Google – and hope our users are managing this data with eyes open and best practices!
“Fight it or embrace it?” becomes the question.
Data Leakage 3.0…I like this one!
Here comes the problem: you note that Google charges you $0 for application services (Google Docs, for example). In the real world nothing comes for free, so you pay the price either by swallowing ads that sometimes pop up or by not knowing where your data is and who has access to it, etc. That is the price of the ‘free’ services, and you either accept it or you don’t. I am not surprised at all that those kinds of issues exist even in the beloved Google SW. No one is perfect, and Google is not any different, although I admit they’re very good.
I believe that their search engine, although the best on the market and surely widely used, is also hitting some limits. All I’m hearing is that fewer and fewer people trust the Google search results, and they all rephrase the same question: “Am I receiving the best search results according to my query, or am I receiving the best results for Google or for somebody else?”
Do you think that Google Apps are not good for you? OK, stop lamenting and change the provider or DIY.
Cheers, M.
I like the idea of ‘Hybrid Infrastructure’ and the 80/20 model. Obviously it’s our reality for the next couple of years.
Re security, I believe it relates more to how employees in corporations work with information than to infrastructure. If you used to keep your hard copies in your case and the case was stolen, who’s to blame?
BTW, take a look at how we partly resolve this issue with our MS Office add-on – Office In Cloud: http://www.upriseapps.com/index.php?option=com_content&view=article&id=8&Itemid=8
Milan,
For business users, the cost isn’t $0 but rather $50/user/year, and you don’t get all the ads etc.
As low as that cost is, the potential cost of having some of that data leak is very, very high. So the issue is less about complaining and more about the fact that Google Docs access management for a business needs to be managed by IT resources, IT management apps etc.
Andrew,
Your company’s application is exactly the reason why people will end up having a hybrid model!