Posts Tagged ‘stats’

Who should survive the Euro 2016 groups?

Saturday, June 18th, 2016 | Sport

When you make it to the finals of Euro 2016, you are allocated to one of six groups. The top two teams from each group, after three matches, get a pass into the round of 16. The four best third-placed teams also make it through.

But who should we expect to see there? One way to estimate it is to work out how tough each group is, by taking the world ranking of each team in the group and averaging them.

Group Teams Average rank
A France, Switzerland, Romania, Albania 24
B England, Wales, Slovakia, Russia 22.5
C Germany, Poland, Northern Ireland, Ukraine 18.75
D Croatia, Spain, Czech Republic, Turkey 20.25
E Italy, Ireland, Sweden, Belgium 20.5
F Hungary, Iceland, Portugal, Austria 18

A lower number represents a tougher group and a higher number represents an easier one. This suggests England is in one of the easiest groups: only Group A has a higher average (that is, weaker teams on average), and this does not factor in the home advantage that France has.

We can then compare each team’s ranking to the average ranking of its group to see who should find it easiest to qualify.

Team Difference
Belgium +18.5
Germany +14.75
Spain +14.25
England +11.5
Ireland -12.5
Sweden -14.5
Iceland -16
Albania -18

I have shown the top and bottom four here. The other home nations do not find themselves at the bottom of the table, which is positive news too. Having said this, the difficulty of the group makes little difference: all of these teams are in the same order as if you had just taken the world rankings. So how tough your group is probably does not matter much.
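As a rough sketch, the difference column can be reproduced by subtracting each team’s ranking from its group average. The individual rankings are not quoted in this post, so the values in IMPLIED_RANK are an assumption: they are back-calculated from the two tables above (group average minus difference). The names GROUP_AVERAGE, IMPLIED_RANK and difference are made up for the example.

```python
# Group averages as listed in the first table.
GROUP_AVERAGE = {"A": 24.0, "B": 22.5, "C": 18.75, "D": 20.25, "E": 20.5, "F": 18.0}

# Assumption: these rankings are not quoted in the post. They are
# back-calculated from the two tables (group average minus difference).
IMPLIED_RANK = {
    "Belgium": ("E", 2),
    "Germany": ("C", 4),
    "Spain": ("D", 6),
    "England": ("B", 11),
    "Ireland": ("E", 33),
    "Sweden": ("E", 35),
    "Iceland": ("F", 34),
    "Albania": ("A", 42),
}


def difference(team):
    """Return the group average minus the team's own ranking.

    A positive value means the team is ranked better than its group average.
    """
    group, rank = IMPLIED_RANK[team]
    return GROUP_AVERAGE[group] - rank


# Print the teams from most to least favoured by this measure.
for team in sorted(IMPLIED_RANK, key=difference, reverse=True):
    print(f"{team:8} {difference(team):+}")
```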

What kind of food does Leeds eat?

Wednesday, December 23rd, 2015 | Food

Following on from my previous post looking at statistics we can pull out from the Leeds Restaurant Guide dataset, I wanted to look at how the restaurant scene has changed since we first published the guide.

Here it is:

chart_cuisine_per_edition

In this graph, I have plotted each cuisine type against the number of restaurants. This is shown for the 1st edition (2013), 3rd edition (2014) and 5th edition (2015). As we learned in the last post, the number of restaurants has risen, so in general, we would expect most categories to have grown between each edition. I have not included pub grub as the size of it makes the rest of the data difficult to see.

For the most part, this holds true. Some cuisines have grown faster than others though. We have seen a rise in restaurants serving American, British, International (those that serve food from all over the world with no real speciality) and steak.

Other areas have declined though: buffet, French, Indian and seafood have all fallen. Persian has too, but this was always a small market. The biggest change is possibly Chinese restaurants. In the first edition we had seven Chinese restaurants; now we have only four.

In terms of the most popular cuisines, Italian remains king. When we first wrote the guide we even considered splitting Italian into two categories, one for general Italian and one for restaurants that specifically did pizza. Latin is also very popular thanks to the growth of tapas bars. It used to be as popular as Indian, but Indian has since fallen away.

We can draw the most popular cuisines in a table. I have omitted hotels and casinos, and international because these do not really tell us anything about people’s tastes.

Position 2013 2015
1 Italian Italian
2 Latin Latin
3 Indian British
4 British American
5 American Indian

It is a pretty consistent story. The only change is that Indian has dropped from a joint-second spot in 2013 to 5th, behind British and American. Much of the growth in these categories is down to meat places such as burgers and BBQ, so it could be that people have been looking towards more meat-heavy dishes in recent years. Or it could just be random chance. The sample size is not that big after all.

Leeds restaurants in numbers

Tuesday, December 22nd, 2015 | Food

Earlier this month I launched the 5th edition of the Leeds Restaurant Guide. Now, with five editions behind us and several years of data, I decided it would be interesting to see what we could mine from that information.

Number of restaurants

You might expect the number of restaurants in Leeds to be going up. It is, but only slightly.

chart_restaurant_count

This graph shows the total number of restaurants. Over the past two and a half years the number of restaurants has increased by 10%. These are not the same restaurants though: new ones are opening faster than old ones are closing.

chart_additions_closures

This graph shows the number of new restaurants opening and old restaurants closing between each edition. Restaurants have consistently opened while closures have been more sporadic. It is worth noting though that the release of each edition of the guide has not been equally spaced, even though it is shown this way on the graph, so that distorts the picture somewhat.

How we rate

Most restaurants are likely to be middle-of-the-road, with some not so good restaurants, some very good restaurants, and a few poor and excellent restaurants at either end. So what happens when you plot frequency against rating?

chart_ratings_count

Ah, just what we wanted: a beautiful bell curve! Two is a little low for a perfect curve, but normal distributions are often imperfect in the real world. This suggests to me that our ratings are consistent with what you would expect from restaurants running in the free market.

That only shows data from restaurants that are still open. What about restaurants that have closed?

chart_ratings_closures

What we would expect to see here is a little less clear. Perhaps the 1-rating should be the highest, as poor restaurants should close the most. But given there are so many 3-rating restaurants, this might not be the case, and you may have to adjust for frequency to see such a result. As it is, we have another bell curve.

There is a clear asymmetry in the graph though. Far more 1-rating restaurants close than 5-rating restaurants, and far more 2-rating restaurants close than 4-rating restaurants, indicating that our ratings are broadly consistent with where the market chooses to spend, or not spend, its money.

What type of food is the best?

What cuisine produces the highest standards? Is there any correlation between the type of food and how good a restaurant is?

chart_ratings_by_cuisine

This graph shows each cuisine type and the average rating it receives. No category has an average rating lower than 2 or higher than 4, because no group of restaurants is that consistent.

I was not surprised to see Thai so high up. Steakhouses are also typically in the higher price range, so score well (though we do factor in price to an extent when awarding ratings). Chinese scoring so high is mostly a result of the less nice Chinese restaurants closing down.

The number in brackets after each cuisine indicates the number of restaurants in that category. So the ratings for Persian, German and seafood are pretty meaningless because each is based on a single restaurant.

What useful information we can draw from this is less clear. Just because the average restaurant scores well or poorly does not mean that all restaurants will. There are bad Thai restaurants for example (actually, there aren’t, but there used to be one) and good Indians (lots of them!). However, if you were to avoid eating at new hotels, casinos, fast food and pubs based on it being unlikely to be a good meal, few people would fault you for that.

Casting Light on Evidence

Wednesday, April 23rd, 2014 | Foundation, Humanism

For the April meeting of Leeds Skeptics, Dr Paul Marchant presented a talk entitled “Casting Light on Evidence … & Evidence on Light”. The talk looked at how data is used in public policy making, with varying degrees of success.

A Skeptical Look at Statistics

Friday, December 6th, 2013 | Foundation, Humanism

Last month John Fletcher presented a talk entitled “A Skeptical Look at Statistics” at Leeds Skeptics. It was great to see people there who were really interested in stats. It was also the first event we have held at the Hedley Verity and while it isn’t perfect, it is certainly an acceptable backup venue.

Manually update Awstats on cPanel

Wednesday, May 9th, 2012 | Life, Tech

Awstats is updated by cPanel once per day, but if you want to force a manual update, you can do so with the following command.

/usr/bin/perl /usr/local/cpanel/3rdparty/bin/awstats.pl -config=example.com -update

Poker Stats Library

Tuesday, August 16th, 2011 | Tech

A few weeks ago, I wrote some tools which would help me out in getting to grips with poker, which in general I fail at.

It annoyed me because it should be fairly simple for someone like me to get my head around the poker maths (well, it is: pot odds are easy). So despite the lack of social understanding that the life of a computer scientist brings, I should at least be able to achieve a level of averageness in the game. I have clearly failed to do this, so I decided a bit of work on my basic strategy was needed.

As a result, I built an interactive tool which would teach me what starting hands I should play, similar to the concept of Basic Strategy in blackjack. It presents you with two cards and you have to say what position you can play them from, if any. It will then tell you whether you are correct. If not, it will ask you to try again; if so, it will move on to the next hand.

I also wrote a tool which allows you to select the cards you have, and using the same formulas it will tell you what position that hand is worth playing from. I’ve thrown in a few other simple odds calculations in there as well.

Of course, these won’t make you a great poker player by themselves, but they should provide a good basis to learn from.
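For anyone unfamiliar with the pot odds mentioned above, here is a minimal sketch of the usual calculation. It is not the code from the library (which is written in PHP); the function names pot_odds and worth_calling are made up for the example, and pot is assumed to be the pot as it stands before you call, including your opponent’s bet.

```python
def pot_odds(pot, call):
    """Return the share of the final pot that your call would make up.

    pot  -- chips already in the pot, including the bet you are facing
    call -- chips you must put in to continue
    """
    return call / (pot + call)


def worth_calling(pot, call, win_probability):
    """A call breaks even when your chance of winning equals the pot odds."""
    return win_probability > pot_odds(pot, call)


# Facing a 50 chip call into a pot of 100, you need to win more than
# one time in three for the call to pay off in the long run.
print(worth_calling(100, 50, 0.40))
print(worth_calling(100, 50, 0.30))
```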

Given the tools would otherwise just disappear into the depths of my hard drive somewhere, I’ve decided to publish the code on Github. Should you have any interest, you can download the source from the Github repository. It’s all written in PHP and should run out of the box.

Btw, the images below are screenshots, but the way they have been scaled down looks rubbish. They make more sense when you open them…

Best search terms

Saturday, June 4th, 2011 | Distractions

I recently took a quick glance at my stats to see, well I would say how many people read my blog, but what you really get is how many spam bots have hit your blog. In any case, it was interesting to see some of the search terms that people have used to reach my blog:

  • dogging
  • red light area in london
  • hamster birthday
  • chicken brain
  • osama bin laden death photo
  • sue my chin buff my pylon
  • dont buff my pylon
  • cottaging blog
  • daily star they ve stolen all our jobs
  • bejeweled illuminati
  • talk to dead ancestors
  • water way to have a good time

Dogging I can understand, though I imagine people will be quite disappointed in the content they find when they get here. Chicken brain comes from the time we went to Nando’s for my birthday and found a chicken brain in our food.

The buff my pylon stuff is a reference to Brass Eye while the last term is a reference to Alan Partridge, though the blog post itself is nothing to do with that.

The Daily Star reference refers to a headline they ran in 2008 claiming immigrants had taken every single unskilled job in the past few years.

Beyond that, I’m a little lost though. I don’t have any pictures of Osama Bin Laden’s dead body, I’ve never tried cottaging and I’m fairly sure that Bejeweled is not a product of the Illuminati designed to control our minds. And even if I thought it was, I certainly haven’t expressed that opinion on my blog!

EDIT: Actually, while I didn’t say that, I did suggest PopCap might be the new Illuminati.

Log your visitors to a text file using PHP

Sunday, September 16th, 2007 | Programming, Tech

Not all web hosts grant access to web server logs, and even if you do have access to them, they may not be that useful. The solution is to write your own script which will log your own stats, and it can all be done without a database. Databases are great for this, but not everyone has a spare one, so I am going to use nothing more complex than a text file to store the information.

The logging file

The first thing we need is a file to record each hit and log the information in a text file. This is done by creating a function and adding all the variables into it. We can then add information for these variables later in the tutorial.

<?php
function logthis ($sessionid, $pagevisited, $ip, $browser, $refer)
{
$log = fopen("log.txt","a");

$countryfile = fopen("http://ip-to-country.com/gert-country/?ip=$ip&user=guest&password=guest","r");
$country = fgets($countryfile,50);
fclose($countryfile);

This sets up the basic information for the log. I also used a service which takes an IP address and returns the country that address belongs to, so we can log which country each visitor comes from.

$now = date("d F Y h:i:s A");

fwrite($log,"$now,$sessionid,$pagevisited,$ip,$country,$browser,$refer\n");
fclose($log);

}

?>

This code writes in the information. The first line gets the date, and the fwrite call does the interesting part: all the variables are lined up and separated by commas to allow the data to be analysed later. The \n at the end starts a new line after each record is written.

That is all the code for the logging file. Save it as log.php. We are now done with this file so you can close it down as everything else will be done from the other pages.

Code for the pages

For each of the pages you want to log the stats on you need to insert some code above the <html> tag to allow us to track the visitors to that page. So you will probably want to insert the code into all your pages.

There are two parts to the code that needs to be inserted into your pages. The first includes the log.php file into your page so we have the function. The second gives all the information to be included.

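<?php

require 'log.php';
session_start();

// Read the request details from $_SERVER; the referer header is optional.
$refer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

logthis(session_id(), $_SERVER['PHP_SELF'], $_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_USER_AGENT'], $refer);

?>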

After the require line there is a line telling PHP to start a session. Sessions were introduced in PHP 4 and allow each user to be treated individually, so you can work out when the same user is visiting different pages.

This code can be pasted into all your pages and remain relatively unchanged. The only bit which will need some tinkering with is the path to log.php. So for instance if you had a page in a games folder and log.php was in a folder called stats you would need to change log.php next to require to ../stats/log.php.

A few small tasks

You are almost done! Just a few small things to do and then we are finished. First open up your text editor, Notepad is fine, and save a blank file as log.txt in the same folder as you saved log.php. Once that is done upload the two files and any files which you added the code into to your web space.

It’s best to upload your files to the root directory of your website if you can. You then need to make sure that the text file has read and write permissions. It may have them already, but if it doesn’t or you’re not sure, right-click on it in your FTP client and look for a properties menu or something similar. This is usually how it’s accessed, though it may vary depending on your FTP client.

Finally there is one more thing you may want to do. If all your files use .htm or a similar extension and you can only run PHP scripts on .php pages you will have a problem as you may not want to rename all the files. So if you can’t rename the pages and PHP scripts don’t work in .htm pages you need to edit your .htaccess file.

If you don’t already have one then you can copy the following code into a blank text file, save it as .htaccess and upload it to your web space. If you already have one then download the current one and add this code or modify the existing code to look like this.

AddType application/x-httpd-php .php .html .htm

You can add any other extensions you use to the end of these too.

Analysis & Conclusion

Now your server log is complete and the script will begin counting all your visitors and saving them in log.txt. When you want to view the log all you have to do is either point your browser to the file or download it using your FTP client and open it in a text editor.

That is your basic view. If you would like something more complex, use a spreadsheet application such as Excel. You can open log.txt in a spreadsheet and it should display fine, as we added the commas to separate the data.

You can also use AutoFilter, found in the Tools menu at the top of Excel, to filter the logs on one piece of data, such as one user’s session ID or one browser.

Now you not only have great logs but they look shiny too.
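If you would rather analyse the log with a script than a spreadsheet, the sketch below shows one way to do it. It is written in Python rather than PHP purely for illustration, and the names FIELDS, parse_line and count_by are made up for the example. It assumes the field order written by log.php above. One caveat it tries to handle: browser strings often contain commas themselves, so everything between the country and the final field is treated as the browser. A referer containing a comma would still be split wrongly.

```python
from collections import Counter

# Field order assumed to match the fwrite call in log.php.
FIELDS = ("time", "session", "page", "ip", "country", "browser", "refer")


def parse_line(line):
    """Split one log line into a dict, or return None if it is too short."""
    parts = line.rstrip("\n").split(",")
    if len(parts) < len(FIELDS):
        return None
    # The first five fields never contain commas; the last one is the referer.
    head, middle, refer = parts[:5], parts[5:-1], parts[-1]
    return dict(zip(FIELDS, head + [",".join(middle), refer]))


def count_by(lines, field):
    """Count how many log records share each value of the given field."""
    records = (parse_line(line) for line in lines)
    return Counter(r[field] for r in records if r is not None)


# Made-up sample lines in the same format as log.txt.
SAMPLE = [
    "16 September 2007 03:15:00 PM,abc,/index.php,192.0.2.1,GB,"
    "Mozilla/5.0 (KHTML, like Gecko),http://example.com/\n",
    "16 September 2007 03:16:10 PM,abc,/about.php,192.0.2.1,GB,Opera/9.0,\n",
    "16 September 2007 03:20:42 PM,xyz,/index.php,198.51.100.7,FR,Opera/9.0,\n",
]

print(count_by(SAMPLE, "country"))
print(count_by(SAMPLE, "page"))
```

To run it against the real file, pass an open file instead of SAMPLE, for example count_by(open("log.txt"), "country").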

A simple hit counter in ASP

Wednesday, December 29th, 2004 | Programming, Tech

This tutorial will show you how to build a simple hit counter. It does not use any SQL or databases; it stores the hits in a text file.

All you need to create for this script is your ASP file and a text file. In the text file, simply enter the number 0 and save it as count.txt in the same directory. Take a look at the basic source code.

<%@ Language="VBScript" %>
<% Response.Expires= -1
Response.AddHeader "Cache-Control", "no-cache"
Response.AddHeader "Pragma", "no-cache" %>
<%
if Session("ct") = "" then
fp = Server.MapPath("count.txt")
Set fs = CreateObject("Scripting.FileSystemObject")
Set a = fs.OpenTextFile(fp)
ct = Clng(a.ReadLine)
ct = ct + 1
Session("ct") = ct
a.close
Set a = fs.CreateTextFile(fp, True)
a.WriteLine(ct)
a.Close
Set a = Nothing
Set fs = Nothing
else
ct = Clng(Session("ct"))
end if 
%>

Now let’s break it down into three sections.

<%@ Language="VBScript" %>

This just states that the page is a VBScript page.

<% Response.Expires= -1
Response.AddHeader "Cache-Control", "no-cache"
Response.AddHeader "Pragma", "no-cache" %>

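This section stops the browser caching the page, so each visit fetches a fresh copy. It is the session check in the next section that stops a user refreshing the page to clock up hits.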

<%
if Session("ct") = "" then
fp = Server.MapPath("count.txt")
Set fs = CreateObject("Scripting.FileSystemObject")
Set a = fs.OpenTextFile(fp)
ct = Clng(a.ReadLine)
ct = ct + 1
Session("ct") = ct
a.close
Set a = fs.CreateTextFile(fp, True)
a.WriteLine(ct)
a.Close
Set a = Nothing
Set fs = Nothing
else
ct = Clng(Session("ct"))
end if 
%>

This is the main section which adds the hits.

fp = Server.MapPath("count.txt")

This tells the server where to find the file. You can modify the file location by changing count.txt. So for instance if you wanted it to be called hitcounter.txt and in the directory db you would use:

fp = Server.MapPath("db\hitcounter.txt")

All you have to do is alter the file path in the quote marks.

Set fs = CreateObject("Scripting.FileSystemObject")
Set a = fs.OpenTextFile(fp)
ct = Clng(a.ReadLine)

This section opens the file using FileSystemObject and reads the first line. It then sets the variable ct to the number of hits recorded so far.

ct = ct + 1

This line adds one hit to the total number of visitors.

Session("ct") = ct
a.close

This part stores the new visitor count in a session variable and closes the text file.

Set a = fs.CreateTextFile(fp, True)
a.WriteLine(ct)
a.Close

This code creates a new text file over the old one and writes the new number of visitors to it. Then it closes the text file.

Now you have a working hit counter. All you need to do is add the hit counter into your page:

You are visitor number <%=ct%>!

This would display the number of visitors. Now to save confusion, here is the full source code for the page:

<%@ Language="VBScript" %>
<% Response.Expires= -1
Response.AddHeader "Cache-Control", "no-cache"
Response.AddHeader "Pragma", "no-cache" %>
<%
if Session("ct") = "" then
fp = Server.MapPath("count.txt")
Set fs = CreateObject("Scripting.FileSystemObject")
Set a = fs.OpenTextFile(fp)
ct = Clng(a.ReadLine)
ct = ct + 1
Session("ct") = ct
a.close
Set a = fs.CreateTextFile(fp, True)
a.WriteLine(ct)
a.Close
Set a = Nothing
Set fs = Nothing
else
ct = Clng(Session("ct"))
end if 
%>
<html>
<body>
You are visitor number <%=ct%>!
</body>
</html>
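
The same read, increment and write cycle can be sketched outside ASP to make the logic easier to follow. The version below is in Python purely for illustration and is not a replacement for the page above. The function name count_visit is made up, and a plain dictionary stands in for the ASP Session object. Like the original, it does no file locking, so two visitors arriving at exactly the same moment could be counted as one.

```python
from pathlib import Path


def count_visit(counter_file, session):
    """Increment the stored total once per session and return the total.

    counter_file -- path to a text file holding a single number
    session      -- a dict standing in for the ASP Session object
    """
    if "ct" not in session:
        path = Path(counter_file)
        # Read the current total, add this visitor, and write it back.
        total = int(path.read_text().strip()) + 1
        path.write_text(f"{total}\n")
        # Remember the total so a refresh does not count again.
        session["ct"] = total
    return session["ct"]
```

Calling count_visit twice with the same session dictionary returns the same number both times, while a new empty dictionary counts as a new visitor.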