Tuesday 24 April 2012

Renewals available now!

Hopefully you'll have already seen the announcement on one of our many communication channels such as our website, Facebook page or Twitter feed but if not then read on.

Many of you will remember the changes we brought in last year, in April 2011.  Due to funding restrictions, we had to reduce the CPU allocation of all users to a maximum of 2000 free CPU hours in one year.  You can read the original announcement on our website.  As we are now a year on, all NGS users can apply for another 2000 free CPU hours.

If you are looking for some proof-of-concept computing, a "sand pit" area for your PhD students, or a way to test concepts before purchasing more hours, then this is an ideal opportunity.

If you have any queries at all then don't hesitate to contact the NGS helpdesk.

Monday 16 April 2012

"hello, science\n"

It is worth pondering how scientific programming is different from other programming. Last year I gave an introductory talk on specialised languages used for science (in which I include Fortran but mainly covered R, APlus, and suchlike). How do you do "hello, world" in science?  It has to be floating point, so I picked calculating the length of a vector.

Let's just digress for a second to do that. Say I want to calculate the length of (v_i); I can then start with s=0 and loop over i, adding v_i^2, and finally take the square root of the sum:

sub vlen { my @v=@_; my $s=0; foreach (@v) {$s+=$_*$_;} return sqrt($s); }

Or we can do it more functionally, creating a new vector of squares ("map"), the elements of which are then added together ("reduce"):

(sqrt (reduce #'+ (map 'list (lambda (x) (* x x)) v)))

... which is the origin of the MapReduce paradigm, but it has the disadvantage of creating a temporary copy (here a list) of the squares. However, if you are doing them in parallel, with each task squaring its own entry (which you might if v is large), you do need to keep the intermediate results anyway.
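For comparison, here is a minimal sketch of the same two approaches in Python. The function names length_loop and length_map_reduce are made up for illustration; the second one deliberately builds the list of squares to mirror the temporary copy discussed above.

```python
import math
from functools import reduce
from operator import add


def length_loop(v):
    """Accumulate the sum of squares in a loop, then take the square root."""
    s = 0.0
    for x in v:
        s += x * x
    return math.sqrt(s)


def length_map_reduce(v):
    """Map each entry to its square, then reduce the squares with addition."""
    # list(...) forces the temporary copy of squares; without it,
    # Python's map would produce the squares lazily, one at a time.
    squares = list(map(lambda x: x * x, v))
    return math.sqrt(reduce(add, squares, 0.0))


v = [3.0, 4.0]
print(length_loop(v))        # 5.0
print(length_map_reduce(v))  # 5.0
```

Both give the same answer here; the difference is only in whether the intermediate squares are ever held in memory all at once.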

Then there are questions of precision and suchlike, for which David Goldberg's paper is still one of the best introductions. This is in contrast to "normal" programming, where one should read Zen and the Art of Motorcycle Maintenance (but see also 10 papers).
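To make the precision point concrete, here is a small Python sketch of one way the simple sum of squares can go wrong: squaring a large entry overflows even though the length itself is representable. The function names are hypothetical, and scaling by the largest entry is just one common workaround, not the only one.

```python
import math


def length_naive(v):
    """Square root of the plain sum of squares; x * x can overflow to inf."""
    return math.sqrt(sum(x * x for x in v))


def length_scaled(v):
    """Divide by the largest magnitude first so no square exceeds 1."""
    m = max((abs(x) for x in v), default=0.0)
    if m == 0.0 or math.isinf(m):
        return m
    return m * math.sqrt(sum((x / m) ** 2 for x in v))


big = [3e200, 4e200]
print(length_naive(big))   # inf, because 3e200 * 3e200 overflows
print(length_scaled(big))  # close to 5e200
```

In practice one would probably reach for a library routine instead (Python 3.8 and later accept any number of coordinates in math.hypot, for example), but the sketch shows why such routines exist.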

We can then ask how science use of * is different from normal use of * (where * is anything). Do scientists use the cloud in a different way from non-scientists?

With this in mind, JISC and STFC co-organised a workshop on scientific computing in the cloud (and grids). Funded by EPSRC, and with about 75 registered participants and 15 speakers from the UK and beyond, it focused on the science use of cloud (and grid) resources. There were a number of discussions on cost effectiveness, cost models, and the true cost of doing science in clouds compared to your own (university's) resources. How careful should you be about putting your data "in the cloud"? Here we are just talking about analysis of data, not long term storage. How do you convince sceptical users?

It seems that some of the lessons learned from the grid carry over to the cloud world: the use of gateways and portals is a useful way to get researchers started using the cloud, but then someone needs to build these things for the research communities - and they will in general be domain specific. And building these cannot just be a proof of principle; they have to be production ready and supported.

Of course e-scientists have scientific applications, specialised libraries, and repositories of libraries - and every e-science programmer should know their BLAS and LAPACK... on the other hand, the presence of gateways and portals brings hope to the "ordinary" researchers who want to make the most of the brave new world of the fourth paradigm but are not themselves programmers and choose (rightly) to focus on their science.

Science use of clouds may have learnt from science use of grids, but clouds also introduce new issues. We agreed at the workshop that it was worth pursuing the case studies. There was no single "pain point" for everyone, but everyone learnt from each other. Supporting scientific research in the clouds (and grids) is a research topic in its own right, bringing together computing, science, best practices, usability, security, performance, and more - and as long as we continue to share experiences, the researchers who use the infrastructure will benefit.

Wednesday 11 April 2012

NGS at the EGI Community Forum, Munich


You can tell it’s been conference season over the past few weeks – lots of travelling, lack of sleep and notes written in cryptic language on my laptop from various sessions and presentations.

The EGI Community Forum was held in Munich, Germany at the end of March and consisted of 4 days of conferencing and various workshops on the Monday morning.  As well as helping to look after the UK NGI exhibition stand, I also attended a wide variety of interesting sessions including:
Each session had its highlights – the EGI session looked at how to count the number of users that EGI actually has.  They attempted to do this through the use of VOMS (Virtual Organization Membership Service) but there were problems with the information contained being out of date or indeed missing in some cases, not all users being registered, expired users still being in the system and many more.  However at the end of the day they did eventually come up with a definitive figure which was as accurate as possible on the day it was calculated – 20706.

Also in this session was a presentation from the German national grid – D-Grid.  They presented on a business model for a sustainable Grid infrastructure.  The slides from this session are definitely worth a look for anyone interested in the next stage of national e-infrastructure.

I also presented on the NGS Campus and Community Champion initiatives at the NGS in the session on Communication.  To save me telling you all about it, I’m instead going to provide a link to a blog post written by Elizabeth Leake who wrote about her take on the session and my presentation.

A big congratulations to the local organisers who did a fantastic job – great venue and great food as well as inbuilt entertainment in the conference venue.  You may have to join our Facebook page to see evidence of this coming soon!

Munich was another great EGI event and we’re already planning and looking forward to the next one, which is the EGI Technical Forum in Prague in September.

Wednesday 4 April 2012

Doing a lot of talking about software


Recently I attended the Software Sustainability Institute Collaboration Workshop (CW) which was held in a very sunny Oxford for 2 days.  It was a busy workshop for me due to being on the steering committee, being part of the events team, chairing a session, giving a lightning talk and scribing for some of the sessions as well!

If you’ve never been to a CW before then the best way to describe it is a conference but not as you know it!  In most conferences people sit and listen to one person giving a demo or PowerPoint presentation at the front of a lecture theatre.  At a CW people pick the topics they want to discuss and head off into break out rooms to have stimulating and interactive discussions about these topics.  Everyone then reconvenes in the main lecture theatre and all the groups report back to inform all delegates of the points and issues raised as well as some possible solutions!

However before the breakout sessions there were some lightning talks – short presentations done against the clock.  Simon Hettrick from SSI makes sure that there are no misunderstandings as a large countdown timer is projected up on the screen along with the one and only slide you are allowed.  I have done lightning talks before at SSI events but this time Simon had raised the bar by only giving each delegate a mere 3 mins.  As I had to present on both the Campus and Community champs in this time it was a tall order but I made it – just!

After the adrenaline rush of the lightning talks we moved onto the more sedate business of breakout sessions.  During the two days I attended several sessions -

•    Building research and communication networks across disciplines
•    How to blog, and how to run a blog 
•    Bringing together representatives of the research community: Institute's Agents and SSAs, and the SeIUCCR Community and Campus Champions
•    Using the internet and social media to increase your impact and publicise research to the public and research community 

From each of these sessions the 5 most important points learnt during the session were recorded and reported back along with -

•    What are the problems, and are there solutions?
•    What further work could be done, and who should do it?
•    Are there any useful resources that people should know about?

All the notes from all the sessions are available through the Collaborations Workshop 2012 Google Group – you don’t need a Google account to view the information.  They make for some very interesting reading particularly if you are a research software engineer or a researcher who uses software!

Photos from the event are also available which prove just how nice the weather really was before we descended back into winter this week!