Archive for the 'php' Category

Managing a large codebase

Tuesday, July 17th, 2007

Anyone who has worked at an organization with more than a few developers for a reasonable period of time will have felt the pain associated with a growing codebase. The word ‘legacy’ creeps into everyday language and the time spent on maintenance tasks soon exceeds the time spent writing new code.

The maintenance tasks hopefully make your existing clients happy (‘serve the client in front of you’) but the reduction in new code generally means fewer new features per developer, which in turn limits how much more product you can sell to your clients.

There is of course a correlation between the amount of code you write and the amount of maintenance it requires. The exact ratio, however, depends entirely on your processes and how they affect the technical debt you incur.

Relish deleting code

My first piece of advice to any developer (you never know when your own codebase will become a multi-developer maintenance monster) is to relish deleting code. Most code, after all, is the application of standard algorithms and patterns to specific problems and is therefore not that useful or unique once it is no longer required.

Don’t keep code around just in case. You have it under version control, so as soon as it is orphaned, delete that sucker. Have an active deprecation process that is regular and ruthless!
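One way to make a deprecation process visible is to have old code announce itself whenever it is still called. The following is a minimal sketch, assuming a hypothetical legacy helper (legacy_format_date) that has been superseded by a hypothetical replacement (format_date); neither name comes from a real codebase. It uses PHP's trigger_error with E_USER_DEPRECATED and collects the notices so remaining callers can be found before the old function is deleted.

```php
<?php
// Hypothetical replacement function that new code should call.
function format_date($timestamp)
{
    return gmdate('Y-m-d', $timestamp);
}

// Hypothetical legacy helper, kept only until its callers are migrated.
function legacy_format_date($timestamp)
{
    // Announce the deprecation every time the old entry point is used.
    trigger_error(
        'legacy_format_date() is deprecated; use format_date() instead',
        E_USER_DEPRECATED
    );
    return format_date($timestamp);
}

// Collect deprecation notices so they can be counted and chased down.
$deprecations = [];
set_error_handler(function ($errno, $errstr) use (&$deprecations) {
    $deprecations[] = $errstr;
    return true; // handled; skip PHP's default error handler
}, E_USER_DEPRECATED);

$formatted = legacy_format_date(0);

restore_error_handler();
```

Once the collected list stays empty for a release or two, the legacy function has no callers left and can be deleted with some confidence.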

Campaign actively to deprecate unused functionality within your organization as well. The temptation to keep functionality around just in case is not limited to developers; product development, sales and marketing all fall into this trap.

Leaving unused functionality and code intact costs an organization in a number of ways:

  • Developers have more complexity to deal with and this will always result in waste.
  • Users have more unnecessary complexity, which affects the usability of your product, which in turn affects how your product is perceived.
  • Technical debt is incurred over time. It’s like continuing to pay rent on an apartment you already moved out of.

The global namespace is not your playground

Making changes becomes the focus well before the codebase even reaches the inflection point of maintenance outstripping new code. This is because many features of any given software product are built on top of existing features.

The enemy of change is the dependency, and the easiest way to create unnecessary dependencies is to create globals: if something is in the global scope, other coders will use it. Declaration scope and JIT (just-in-time) inclusion of necessary dependencies are your friends; use them wisely.
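As an illustration of narrow declaration scope combined with just-in-time inclusion, here is a minimal sketch using PHP's spl_autoload_register. The class name FeedRenderer, the function renderFeed and the one-class-per-file layout in a temporary directory are all assumptions made for the example, not part of any real project.

```php
<?php
// Hypothetical layout: one class per file, each file named after its class.
$classDir = sys_get_temp_dir() . '/jit_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/FeedRenderer.php',
    "<?php\nclass FeedRenderer\n{\n    public function render() { return 'feed'; }\n}\n"
);

// The loader runs only when an unknown class is first used.
spl_autoload_register(function ($class) use ($classDir) {
    $file = $classDir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

function renderFeed()
{
    // The dependency lives in the narrowest scope that needs it,
    // so no variable leaks into the global scope.
    $renderer = new FeedRenderer(); // first use triggers the autoloader
    return $renderer->render();
}

$loadedBefore = class_exists('FeedRenderer', false); // not included yet
$output = renderFeed();
$loadedAfter = class_exists('FeedRenderer', false);  // included on demand
```

The class file is only read when renderFeed actually runs, and the renderer object never becomes something other code can reach for by accident.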

Those entry points into your codebase that are necessary, because they are utilitarian or because they kick your application off, should at least be namespaced off into a structure by using a pattern such as the Singleton. Don’t be fooled: a Singleton is still a type of global, but it is much easier to attach documentation to it and to control its signature.
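A minimal sketch of such a Singleton in PHP might look like the following. The class name Application and its get/set methods are hypothetical and chosen only for illustration; the point is that the single shared instance has one documented way in.

```php
<?php
/**
 * Hypothetical application entry point, namespaced off behind a Singleton.
 */
final class Application
{
    /** @var Application|null The single shared instance. */
    private static $instance = null;

    /** @var array Configuration values held by the application. */
    private $config = [];

    // Private constructor: callers must go through getInstance().
    private function __construct()
    {
    }

    // Private clone: a second instance cannot be made by copying.
    private function __clone()
    {
    }

    /**
     * @return Application The one shared instance, created on first use.
     */
    public static function getInstance()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function set($key, $value)
    {
        $this->config[$key] = $value;
    }

    public function get($key, $default = null)
    {
        return array_key_exists($key, $this->config) ? $this->config[$key] : $default;
    }
}

Application::getInstance()->set('feed_title', 'My blog');
$title = Application::getInstance()->get('feed_title');
```

It is still global state, so the usual caveats apply, but every access goes through getInstance() and the documented methods rather than through a loose variable anyone can overwrite.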

Divide your code into layers to reduce coupling, and avoid having lower layers called directly from layers that are not immediately above them. For instance, if your page or front-controller calls your database directly, you will find yourself reimplementing the same query creation, query execution and object population code over and over. It’s much better to move this functionality into an object hierarchy that specialises in these tasks, as there is nothing useful in seeing low-level logic scattered throughout the logic behind your presentation.

Much of this advice is available from a variety of sources and I do recommend reading up further on the topics I have mentioned if some warning bells about your own organisation’s codebase started to ring. Codebases can become massive, particularly when there are multiple developers involved multiplied by a few years of time. Some continual investment in keeping the house clean will pay off by allowing you to spend more time on new code.
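The layering idea can be sketched as follows. All of the names here (PostStorage, ArrayPostStorage, Post, PostRepository, renderTitles) are hypothetical, and an in-memory array stands in for a real database so the example runs on its own. The presentation function only talks to the repository, and only the repository talks to storage.

```php
<?php
// Lowest layer: a hypothetical storage contract. A real implementation
// would wrap a database; this one is backed by an array.
interface PostStorage
{
    public function fetchRows($table, array $criteria);
}

final class ArrayPostStorage implements PostStorage
{
    private $tables;

    public function __construct(array $tables)
    {
        $this->tables = $tables;
    }

    public function fetchRows($table, array $criteria)
    {
        $rows = isset($this->tables[$table]) ? $this->tables[$table] : [];
        $matches = [];
        foreach ($rows as $row) {
            $keep = true;
            foreach ($criteria as $column => $value) {
                if (!array_key_exists($column, $row) || $row[$column] !== $value) {
                    $keep = false;
                    break;
                }
            }
            if ($keep) {
                $matches[] = $row;
            }
        }
        return $matches;
    }
}

// Domain object populated by the middle layer.
final class Post
{
    public $id;
    public $title;

    public function __construct($id, $title)
    {
        $this->id = $id;
        $this->title = $title;
    }
}

// Middle layer: query creation and object population live here, once.
final class PostRepository
{
    private $storage;

    public function __construct(PostStorage $storage)
    {
        $this->storage = $storage;
    }

    public function findPublished()
    {
        $posts = [];
        foreach ($this->storage->fetchRows('posts', ['status' => 'published']) as $row) {
            $posts[] = new Post($row['id'], $row['title']);
        }
        return $posts;
    }
}

// Top layer: presentation logic knows nothing about tables or rows.
function renderTitles(PostRepository $repository)
{
    $titles = [];
    foreach ($repository->findPublished() as $post) {
        $titles[] = $post->title;
    }
    return implode(', ', $titles);
}

$storage = new ArrayPostStorage(['posts' => [
    ['id' => 1, 'title' => 'Managing a large codebase', 'status' => 'published'],
    ['id' => 2, 'title' => 'Unfinished draft', 'status' => 'draft'],
]]);
$rendered = renderTitles(new PostRepository($storage));
```

Swapping the array-backed storage for a database-backed one would not require touching renderTitles, which is the payoff of keeping each layer talking only to the one directly beneath it.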

WordPress and its two faces

Friday, July 6th, 2007

There is no denying that feature-wise WordPress offers everything that an amateur or even a professional blogger could need.

Unfortunately, something I always suspected of WP but never wanted to admit was that internally there was a level of chaos which might spill out and hurt me.

Well last night, hurt me it did. I am not sure whether an edit to the theme, a new plugin or a recent upgrade was responsible, but I noticed the feed was no longer validating and various applications, including Feedburner, were having trouble with it.

Feedburner’s FeedMedic led me quickly to discover an errant white space at the start of the file.

Familiar with PHP as I am, I thought that following the execution logic from the index page (which returns the feed when passed a parameter) should eventually locate the mischievous white space.

The first few files seemed straightforward and single-purpose; however, the more I dug, the more apparent WP’s internal discord became. Globals, functions everywhere (including a whole stack in a file called ‘functions’), procedural code and the occasional class that of course gets instantiated into the global namespace.

After about an hour of checking the ends of PHP files, trimming white space and double-checking every echo I could locate, I decided to take another tack. I tried to create a simple file which just recreated the variables I needed for the feed file to work.

This work revealed a very deep hierarchy of requires. Each require I added pulled in further dependencies which, as I proceeded, seemed less and less relevant to the work required to render the feed.

Finally I realized that WP was not going to let me debug this issue in any sort of reasonable time (without setting up a step debugger or retracing my earlier steps with echoes until I located my space). I should just accept that a space was being buffered and try to clear it just before I echoed my feed.

Some quick research on php.net turned up only one function which looked like it might clear the default output buffering PHP uses: ob_clean. I’d always thought this would only apply to output buffers I’d explicitly set up, but it seemed to do the trick for my feed.
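The workaround can be sketched roughly like this. It is not WordPress code: the function name renderFeedCleanly and the leaky bootstrap closure are hypothetical stand-ins, and an explicit ob_start() plays the role of the output buffer that PHP opens by itself when the output_buffering setting is enabled. ob_clean() only has something to discard when such a buffer is active, which is why the level is checked first.

```php
<?php
/**
 * Run some bootstrap code that may leak white space, then emit the feed
 * with anything buffered so far thrown away.
 */
function renderFeedCleanly($bootstrap, $feedXml)
{
    // Stand-in for the buffer PHP opens when output_buffering is enabled.
    ob_start();

    // Hypothetical include chain; somewhere in it a stray space is echoed.
    call_user_func($bootstrap);

    // Discard whatever has been buffered, but keep the buffer open.
    if (ob_get_level() > 0) {
        ob_clean();
    }

    echo $feedXml;

    // Return what would have been sent to the client and close the buffer.
    return ob_get_clean();
}

$leakyBootstrap = function () {
    echo " \n"; // the errant white space
};
$feed = '<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"></rss>';

$result = renderFeedCleanly($leakyBootstrap, $feed);
```

With no active buffer the stray space would already have been sent and ob_clean() could not take it back, so this trick depends on buffering being switched on somewhere.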

So what to conclude from all this?

I believe blog software will continue to evolve, but I wonder whether WordPress will be able to keep pace with all its technical debt.

That a space can break an application is somewhat of a flaw in PHP. (Using a template system with PHP helps avoid that regular case of unwanted output creeping in after a close tag.)

That it took a few hours to get a resolution on this issue is a critical flaw with WordPress. This doesn’t seem to have hindered what appears to be the healthiest plugin support of any blogging platform. Still, I feel it will ultimately limit the competitiveness of WordPress as better architected solutions, able to foster a dedicated plugin development community similar to WordPress’s, come onto the scene.

Blurring the line between code and documentation

Wednesday, January 24th, 2007

The line between code and documentation is blurring with the introduction of regimented documentation standards. In PHP (and likely other languages as well) you can use reflection to programmatically access documentation blocks, meaning you *could* make code dependent on information in your doc comments! Scary!
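For the curious, this is roughly what reading a doc block through reflection looks like. The sketch uses PHP's ReflectionMethod::getDocComment(); the interface FeedSource and the helper documentedReturnType are hypothetical names invented for the example, and the regular expression only handles a simple single-word @return tag.

```php
<?php
interface FeedSource
{
    /**
     * Hypothetical method used only for this illustration.
     *
     * @return array
     */
    public function items();
}

/**
 * Read the type named in a method's @return tag, or null if none is found.
 */
function documentedReturnType($class, $method)
{
    $reflection = new ReflectionMethod($class, $method);

    // getDocComment() returns false when the method has no doc block.
    $doc = $reflection->getDocComment();
    if ($doc === false) {
        return null;
    }

    if (preg_match('/@return\s+(\S+)/', $doc, $matches)) {
        return $matches[1];
    }
    return null;
}

$type = documentedReturnType('FeedSource', 'items');
```

Note that doc comments are not guaranteed to be available at runtime: with opcache enabled and opcache.save_comments turned off they are stripped, which is one more way code that relies on them can surprise you.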

Scarier still is that I discovered this via a colleague who was considering using it. You could argue it conforms to the DRY principle: we were looking to add stricter control of the return types of methods that implement an interface we defined. Without the reflection API, we would have to redefine information that is already part of our doc comments.

For now we have elected not to proceed because it violates the rule of least surprise. Developers do not expect edits to comments to affect code conditionals, and thus our effort to enforce correct use of interfaces could backfire, causing exceptions in classes implementing our defined interfaces.

We may reconsider this approach at some later date, either when the trend has continued to the point that there is no line between code and API documentation, or when we are confident our unit testing will prevent the deployment of exceptions caused by incorrect documentation.
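To make the surprise concrete, here is a minimal sketch of the kind of check we were considering. It is not our actual code: ItemCounter, MisdocumentedCounter, BasketCounter and callChecked are hypothetical names, and the checker only understands a few scalar type names. The two interfaces differ only in a comment, yet one call succeeds and the other throws.

```php
<?php
interface ItemCounter
{
    /**
     * @return int
     */
    public function total();
}

// Identical to ItemCounter except for the documented return type.
interface MisdocumentedCounter
{
    /**
     * @return string
     */
    public function total();
}

final class BasketCounter implements ItemCounter, MisdocumentedCounter
{
    public function total()
    {
        return 3;
    }
}

/**
 * Call a method and verify the result against the interface's @return tag.
 */
function callChecked($object, $interface, $method)
{
    // Map of documented type names to the PHP functions that test them.
    $checks = ['int' => 'is_int', 'string' => 'is_string', 'array' => 'is_array'];

    $value = $object->$method();
    $doc = (new ReflectionMethod($interface, $method))->getDocComment();

    if ($doc !== false
        && preg_match('/@return\s+(\w+)/', $doc, $matches)
        && isset($checks[$matches[1]])
        && !call_user_func($checks[$matches[1]], $value)
    ) {
        throw new UnexpectedValueException(
            $interface . '::' . $method . '() is documented to return ' . $matches[1]
        );
    }
    return $value;
}

$counter = new BasketCounter();
$ok = callChecked($counter, 'ItemCounter', 'total');

$message = null;
try {
    callChecked($counter, 'MisdocumentedCounter', 'total');
} catch (UnexpectedValueException $e) {
    $message = $e->getMessage();
}
```

Nothing in the executable code differs between the two calls; a one-word edit to a comment is enough to turn a working call into an exception, which is exactly the behaviour most developers would not expect.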