The July-August report to the Community on Audit progress. These reports are issued roughly every two months, or on a milestone.
1. Critical Systems
Critical Systems is the blocking element of the audit.
The audit result is driven by progress in Critical Systems.
1.1 Migration of Critical Systems
a. The Board of CAcert Inc has created and approved a new plan to rebuild the critical systems in the Netherlands, with a new team of (Dutch) system administrators.
b. Background. The strategic and audit requirement is of two parts: to rebuild the systems administration team from scratch, and to move the data from Vienna to servers available in the Netherlands secure facility. More: In managing the CAcert critical systems, there must be a team with multiple people. This is firstly an audit requirement for dual control / 4 eyes, secondly it is industry practice (c.f., San Francisco City network), and thirdly, there is too much work for one person to do alone. The work must be shared out, and one person doing all the work soon drowns and loses objectivity.
c. When the migration happens, CAcert will lose service, perhaps for several days, and perhaps repeatedly. Today's pencilled-in date is from 30th September. As the new team ramps up, there may be significant shortages. This is unavoidable, and indeed it will be a relief to see: finally some professional choices being made. The price of a scalable and professional CA service is that hard choices have to be made.
d. See 1.d in last report.
1.2 Audit Fail for Vienna CA
a. The Vienna situation was always fraught, always intended to be temporary, even emergency. It inherited many difficulties from the Australia situation.
b. While much bug-fixing was done, and some features were added, it is clear that the big picture did not advance over the period. E.g., dual control over access was bypassed, not improved. The team did not grow; the focus was always on short-term bug fixes, not on finding more people to help. Etc.
c. For these reasons, it is beyond time to declare the Vienna CA as an audit fail. As every attempt to address the issues has met with roadblocks, it is unclear that any method can ever be found to fix that CA.
d. Then, there is a big and serious question: should CAcert run a CA that can not be audited? This question was put to the board, and the board has just approved the plan to shut down the Vienna CA, regardless of the success of the migration.
1.3 Audit Fail for Roots
a. The current set of Roots also cannot pass Audit. This message was given to the Board of CAcert at the September 2007 'top'. The Board agreed and some timelines for creating new roots were discussed then.
b. A little outline: the problem of the Roots is this.
- There is little documentation as to how the existing roots should be secured.
- There is no clear history of the security of the roots. This relates primarily to the Australia period, where there is little information as to what, where, when and how, but also applies to Vienna.
- Sanity checks do not change the story. There is an unclear situation with passwords, backups, and escrows of roots. Each of these assets has little documentation and little consistency, and therefore neither clarity nor auditability.
c. So we need to create new roots. However, there is no point in creating new roots without sorting out the issues identified with the existing set of roots. Therefore, we need:
- a new structure of roots for the future,
- documentation as to how the roots should be secured,
- history maintained as logs and announcements of any changes, etc,
- a clear escrow and recovery situation, and,
- a way to create new roots, within the environment of the first four points.
d. Finally: creating roots is dependent on a good and secured critical systems setup. So while CAcert will have to work through the five points listed in c. above, the actual implementation is dependent on enough success in the critical systems to warrant creating some new roots. In other words, working through those five points is highly important, but the actual creation of new long-term roots will not happen until we have got the critical systems on a sound footing.
1.4 Security Bug Handling
a. A security bug was reported by a Member, Kriss Andsten. The actual bug was fairly constrained: a Member can add any email address in the .JP tld without proper checking.
b. Within around 12 hours of the bug being reported, the following happened:
- the software team had fixed the parameter checking and installed that in the online system.
- management team had met and discussed.
- a decision was taken to turn off register_globals in the test system, study the results, and then turn it off in the main online system.
- bug notification was filed as a blog post to the Community.
c. Far more embarrassing was the underlying reason: inadequate parameter checking and register_globals turned on, which in essence means that any attacker can set any variables that aren't checked. Blech. The fact that the PHP code is still using register_globals is rightly deplorable, and even laughable. Kriss says:
Given that outset, I think your codebase really isn't suitable for the task and disclosed why I think this to be the case and why I question the judgement of operating a CA on it, seeing it doesn't really allow for either good maintainability or a good security standard (register_globals in combination with these two is outright scary)
We should see this opinion in two ways. Firstly, this establishes a baseline for the software development team to aim for: eliminate all such bugs. In practice, I suspect there will be more of these bugs to surface, and we hope that Kriss continues to probe and publish. Secondly, the fact that we see a response in place within a short time is a good sign. This is faster than other organisations, but perhaps CAcert was lucky: the fix was easy this time.
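To make the class of bug in c. concrete, here is a minimal sketch of why register_globals combined with unchecked parameters is dangerous. Python stands in for PHP, and every name in it (the handlers, the `token` and `email_verified` parameters) is hypothetical; none of it is taken from the CAcert codebase or from the actual bug.

```python
# Hypothetical illustration only; not CAcert code.

def handle_unsafe(request_params):
    """Mimics register_globals: request values land in the script's scope first."""
    scope = dict(request_params)
    if scope.get("token") == "expected-token":
        scope["email_verified"] = True  # the only place the script sets the flag
    # Bug: the flag is never initialised, so a request-supplied value survives.
    return bool(scope.get("email_verified"))


def handle_safe(request_params):
    """Initialises its own state and reads only the parameter it expects."""
    email_verified = False
    if request_params.get("token") == "expected-token":
        email_verified = True
    return email_verified


attack = {"email_verified": "1"}  # attacker supplies the internal flag directly
print(handle_unsafe(attack))  # True
print(handle_safe(attack))    # False
```

Turning register_globals off corresponds to the second handler's approach: nothing from the request reaches a variable unless the code asks for it by name.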
d. So one bad, one good. This is how it should be: there isn't any such thing as perfect or risk-free security.
e. It's worth taking a moment to discuss the role of audits and management in security. Neither role is there to ensure that the systems are secure, the users are safe, and well tucked in by the time it is time to go to sleep. Instead, management is primarily there to document procedures, ensure that the documents are followed, and deal with the exceptions. Making the hard decisions is an exception, because the doco and procedures should have dealt with it already; when they don't, it's time for management.
f. Audit's role can be simply viewed as checking that management is doing that process, as above. Neither management nor Audit makes the systems secure: You do. You the members, the Assurers, the system administrators and other roles make CAcert secure.
g. Security is a moving target. There isn't an adequate, easy or cheap way to do security, but there are many bad security recommendations. Frequently heard is the rather pathetic advice of "don't ever fail", much beloved of the press and the cryptography world. Another is the "certification" or "branded audit" benchmark idea, which basically says that if you are audited according to Brand X you are secure, or if you have the xyzBlahBlah Certification, likewise. These emperors wear no clothes, as they do not deal adequately with a moving threat model.
h. Having said that, there is a security process. If the process seen above is not followed, and/or the lessons are not learnt, then we have an issue. Let's see whether CAcert can deal with the next security issue.
2. Policies
2.2 The CAcert Community Agreement
a. Important steps with the CAcert Community Agreement have not been completed. Checkboxes to Agree in the online account system remain undone. CAP forms are not yet installed? Members are not notified.
b. The primary cause for slowness here is that software development administration is stalled or diverted, same as systems administration. We basically have to wait until the critical systems issues have been dealt with, because all attention is shifted to solving that problem.
2.3 Other Policy Areas
a. The Assurance Policy has finally gone to DRAFT. This means it is binding on the Community. You should quickly review it in its DRAFT form.
b. Now the Exceptions need to be addressed: TTP, Super-Assurer, Junior, etc.
c. Also, now is the time to start the systems work to support the changes in the AP. The Assurance Policy gives authorisation for the CAcert Assurer's Challenge, so the system may switch off the older unchallenged assurers any time.
d. Arbitration is now dealing with what I consider to be its first important case: an account closure. This is an area where the policies deferred to a human intervention, because the issues are exceedingly complex.
e. Work is advancing on the Security Manual. It is now a document that can be referred to. It is not yet ready for a fuller review. However, it is not a blocking task, see 1. above. It is being filled in, as and when the new team find issues to add to it, e.g., Disk Drive Destruction
2.3 The CPS -- Two Killer Issues
a. Audit has now moved its primary policy focus onto the work-in-progress (wip) CPS at svn.cacert.org/CAcert/policy.htm. The Certification Practice Statement is the document that takes the Assurance and shoves it into the certificates, and is the core document of all CAs.
b. Two big issues have emerged. Both are in the "audit fail" basket, so ignoring them is not an option. How CAcert resolves these two issues is now before the policy group, but let us know your thoughts. Look for the green.
2.3.1 Arbitrary Common Names
a. One is that while Assurance is strong, and provides (among other things) a reliable name for the CommonName or CN field within the certificate, it is not always used. Indeed, in one form of certificate, it is arbitrarily set without Assurance.
b. Mostly, these are the client certificates issued by Organisations to their employees. The setting of the CN in client-side certs is left to the O-Admin. While this person is a CAcert Assurer, and the Organisation is a Member, there is no requirement or process imposed on either in how to set the CN.
c. We might argue that companies won't do anything wrong, but that would be naive. Indeed, we have already seen that some people want limited liability companies to do Assurance, because they don't want the risk. We might legitimately ask, if such Assurers do something wrong, would they then walk away from the liabilities? Companies are there to do things that people won't risk; for this reason, Assurance is a human process, and Assurance is incompatible with trying to shift all risks away.
d. More practically, Members can't easily tell the difference. This suggests either
- that the Members have to check what form of certificate they are dealing with, before they rely on a name that might or might not be good, or,
- that the Name is not necessarily Assured.
The former is impractical; the latter breaks the no-discrimination principle of CAcert.
2.3.2 Domain / Email checking
a. Domain and email checking is currently done once, on adding the domain, by sending an email probe or ping to one selected email address.
b. There are several things wrong with this, and I'll list them.
- the audit criteria want more checking. They specifically want frequent checks, and checks on domain expiry or transfer. There is always some room for manoeuvre in the audit criteria, but in this case, the criteria are quite fierce.
- email / domain checking is currently subject to some known issues to do with DNS spoofing and the like. Search on Kaminsky and DNS.
- checking the domain or email speaks to control but not ownership. Ownership is always superior, because ownership gives authentic control, while improper control can be spoofed.
c. Some might argue that domains are too weak to be checked properly. Let's dispose of that: Such thinking is based on binary security, which is as dead as the dodo. All security is imperfect, get used to it. The proper way to deal with a weak check is to craft many checks that are less correlated with each other, and require more than one. Secondly, if checking domains is too weak by any automated means, then we have to find other means to do it. Which is it?
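The idea in c. of crafting several weakly correlated checks and requiring more than one can be sketched as follows. This is an illustration under stated assumptions, not a proposal for the actual mechanism: the check names, the stub functions and the `required=2` threshold are all hypothetical, and the real set of checks is for the policy group to decide.

```python
from typing import Callable, Dict

Check = Callable[[str], bool]


def domain_checks_pass(domain: str, checks: Dict[str, Check], required: int = 2) -> bool:
    """Return True only if at least `required` independent checks succeed."""
    passed = 0
    for name, check in checks.items():
        try:
            if check(domain):
                passed += 1
        except Exception:
            # A check that errors out simply counts as not passed.
            pass
    return passed >= required


# Hypothetical stubs: real checks would probe email, DNS, a web server, etc.
stub_checks: Dict[str, Check] = {
    "email_probe": lambda d: True,      # e.g. spoofed via a DNS attack
    "dns_record": lambda d: False,
    "web_token": lambda d: False,
}

print(domain_checks_pass("example.org", stub_checks))              # False
print(domain_checks_pass("example.org", stub_checks, required=1))  # True
```

With a single check (`required=1`), spoofing the email probe is enough; with two required, the attacker has to defeat two different mechanisms at once.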
d. For all these reasons, something has to change. Domain checking has to be improved. The details are up to you, and the policy group is probably going to have to craft a framework. This in itself is not good news, because such issues need input from both technical and governance people.
3. Audit Admin
a. According to the original NLnet Timeline, we should have completed the major documentation sets by now. We remain behind on all fronts.
b. In November, there will be an Annual General Meeting of the Association. Also, in November, there is a major audit presentation. The Audit perspective is that we approach a crunch point at the end of the year because of the systems issues; in practice, this means that the systems have to be fixed by the end of October.
c. The next 2-monthly report should be around November.
d. No new major payments have been made from the audit budget. The systems work in the Netherlands may make a call on the CAcert "work" component, and is authorised for that. Also, the "costs" component will probably be called upon in November to acquire some hardware.