Country report for Statistics Finland

Introduction

- Something about the past year

Statistics Finland moved all free-of-charge services to the .NET version of PX-Web during 2015-2016. Moving the chargeable services still requires some work on PX-Web user authentication: there is no easy, ready-made way to set up a chargeable service with authentication using PX-Web. A gatekeeper application for PX-Web has been developed at Statistics Finland: http://yritystietopalvelu2.stat.fi/pxweb/pxweb/fi/

We also opened our renewed Regional key figures service, which now works wholly on top of the underlying PX-Web database tables. All the contents of the page, such as regional pull-downs, key figure listings, all the data shown and all the selection lists for the graphs, are created dynamically using the PX-Web API. This is a great example of how to use the API as straightforwardly and efficiently as possible: http://stat.fi/tup/alue/kuntienavainluvut.html#?year=2016&active1=sss

All StatFin PX-Web tables are now also displayed in our main website search: http://stat.fi/hae_en?q=births

- Changes in content or usage

At Statistics Finland, we have developed a new dissemination concept in which Statistics Finland produces the statistics and builds and maintains a PX-Web website with continuously updated content for the customers. See the examples later in this document.

- Problems and/or good news

PX-Web is a beautifully crafted piece of software. The server is stable and the user interface is intuitive. The users simply love it. There is still a lot of work to do to make PX-Web what it could be: some bugs have to be fixed and some enhanced features added.

- How much of your publishing is in the database?

Over 97% of the statistical releases have tables in the statistical database StatFin (http://pxnet2.stat.fi/pxweb/pxweb/en/statfin/).

Information and experiences

- Which version of PX-Web do you use?

Statistics Finland uses PX-Web 2015 and 2016 (the latest version).
- Do you use the API?

The API is in production use by both Statistics Finland and external customers.

- Any reactions from users on, for example, saved query?

The saved query is also used by Statistics Finland and external customers. Adding the CORS setting would extend the usability of the saved query. The customers love the saved query.

- Other reflections?

The migration to the new .NET version of PX-Web has, in our case, been relatively easy using PX-Edit and PX-Job (the command-line PX-Edit).

What are your future plans (related to the programs)?

- Update to new version

The latest version of PX-Edit is already in production use at Statistics Finland. The new PC-Axis program (PX-Win) will be taken into use as soon as possible; this involves some testing and fine-tuning.

- Development

Development of PX-Edit will continue as before. We are studying the source code of PX-Web to be able to take part in its development. We will also continue to set aside some resources for beta testing the latest versions of the PC-Axis family programs.

Something you would like to discuss at the meeting?

- Future plans for the PC-Axis family
- What new features will be added to the PC-Axis family programs, and when
- Future plans for the Graphics module
- More features to the API?
- The licensing issues for non-profit organisations

Contact persons

- Lauri Hyttinen (director IT development manager) lauri.hyttinen@stat.fi
- Veli-Matti Jantunen (planning officer, PX-Edit developer) velimatti.jantunen@stat.fi
- Hans Baumgartner (planning officer, PX-Web, PX-Edit, API) hans.baumgartner@stat.fi
- Kim Huuhko (planning officer, Graphs) kim.huuhko@stat.fi
Free-of-charge PX-Web databases by Statistics Finland

StatFin, our main PX-Web service, has been powered by the new .NET version of PX-Web since the beginning of this year. (http://pxnet2.stat.fi/pxweb/pxweb/en/statfin/)
Paavo, the PX-Web service with statistical data by postal code area on population, housing, employment, education and income. (http://pxnet2.stat.fi/pxweb/pxweb/en/postinumeroalueittainen_avoin_tieto/)

Eurostat main tables, the PX-Web service with key data on EU Member States, updated daily with the latest data from Eurostat. (http://pxnet2.stat.fi/pxweb/pxweb/en/eurostat/)
International tables, the PX-Web service with key statistics and data by international data producers. (http://pxnet2.stat.fi/pxweb/pxweb/en/kansainvalisen_tiedon_tietokanta/)

PX-Web database services produced by Statistics Finland for other statistics producers in Finland

Statistics produced and disseminated by Statistics Finland for Trafi, the Finnish Transport Safety Agency (http://trafi2.stat.fi/pxweb/pxweb/en/trafi/)
Statistics produced and disseminated by Statistics Finland for the Finnish Tax Administration (http://vero2.stat.fi/pxweb/pxweb/en/vero/)

PX-Web services produced for other organisations

Statistics produced and disseminated by Statistics Finland for Finpro. The aim is to help companies attract foreign investments to Finland, become more international in their line of work and, more recently, promote Finland as an attractive tourist destination to a worldwide audience. (http://visitfinland.stat.fi/pxweb/pxweb/fi/visitfinland)
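The databases listed above can also be read programmatically, assuming the PX-Web API is enabled for the database in question. The following is a minimal sketch of how a table query body could be assembled in the JSON shape used by the PX-Web API (a list of per-variable selections plus a response format). The helper name `build_px_query` and the variable and value codes ("Alue", "Vuosi", "091") are hypothetical and not taken from any actual table; the sketch only builds the request body and does not contact any server.

```python
import json


def build_px_query(selections, response_format="json"):
    """Build a PX-Web API query body.

    `selections` maps a variable code to a list of value codes.
    A value of None is taken to mean "all values of this variable".
    The variable codes are whatever the table metadata reports; the
    ones used in the example below are hypothetical.
    """
    query = []
    for code, values in selections.items():
        if values is None:
            # Wildcard selection: every value of the variable.
            selection = {"filter": "all", "values": ["*"]}
        else:
            # Explicit selection by value code.
            selection = {"filter": "item", "values": list(values)}
        query.append({"code": code, "selection": selection})
    return {"query": query, "response": {"format": response_format}}


# Hypothetical example: two years for one region.
body = build_px_query({"Vuosi": ["2015", "2016"], "Alue": ["091"]})
print(json.dumps(body, indent=2))
```

The resulting body would be sent as an HTTP POST to the table's API address, while a plain GET to the same address returns the table metadata with the real variable and value codes.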
PX-Web: feature requests and bugs

1. Additive search doesn't work for long classifications in the selection boxes.
2. Return to original classification (aggregation functionality) does not work. The selection window is not populated with the original classification but is left empty, and "--- Select classification ---" is probably not the best text for returning to the original classification (this works in the old PC-Axis). Users must be able to view all the different classifications and return to the original classification of the table. Now you have to start the selection window from the beginning.
3. When PX-Web hibernates, it has to re-index the aggregation files, and the API doesn't start until all agg files are indexed. Sometimes this takes over two minutes. Would it be possible to write the index to a file and re-index only when really needed? This would be more CPU friendly and much faster to start up.
4. The logo should support multiple languages, and clicking on it should link to the language-dependent homepage. A database-specific, custom-configured homepage should also be supported. Organisations can have different logos for different languages, for example if the logo contains text.
5. MenuBuilder should also automatically start the index generation for the search.
6. Add an API extension: list the last updated tables (e.g. the last 1-500), and list the tables updated since dd.mm.yyyy.
7. Attributes and cell notes are shown in the table as a triangle in the corner. Are all other footnotes implemented?
8. Please add a feature to the file-based aggregation that reads the agg and vs files from the aggregation directory and all of its subdirectories. This way a .vs file can have a directory of its own, with all of its .agg files in the same directory. It would then be possible to maintain thousands or tens of thousands of aggregation files (as Statistics Finland does).
9. Some log files generated by PX-Web lack the IP address. It is impossible to check from one log file who generated an error or which organisation has retrieved the most tables.
10. Would it be possible for the saved query to support splitting the time variable into two variables?
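Requests 3 and 8 above can be illustrated together. The sketch below is not how PX-Web works internally; it only shows the requested behaviour under simple assumptions: aggregation files are found recursively under one root directory, the "index" is reduced to a map of file paths and modification times, and it is persisted as a JSON file so that a restart can skip re-indexing when nothing has changed. The function names and the JSON index format are hypothetical.

```python
import json
from pathlib import Path


def scan_aggregation_files(root):
    """Return {relative path: modification time in ns} for every
    .agg and .vs file under `root`, including all subdirectories."""
    root = Path(root)
    found = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in {".agg", ".vs"}:
            found[path.relative_to(root).as_posix()] = path.stat().st_mtime_ns
    return found


def load_or_rebuild_index(root, index_file):
    """Reuse the stored index if no aggregation file was added,
    removed or modified; otherwise rebuild it and write it to disk.

    Returns (index, rebuilt), where `rebuilt` tells whether the
    expensive indexing step would have been needed.
    """
    current = scan_aggregation_files(root)
    index_path = Path(index_file)
    if index_path.exists():
        stored = json.loads(index_path.read_text(encoding="utf-8"))
        if stored == current:
            return stored, False
    # A real implementation would parse the files here; this sketch
    # only stores the file list and timestamps.
    index_path.write_text(json.dumps(current, sort_keys=True), encoding="utf-8")
    return current, True
```

Checking timestamps still touches every file, but it avoids parsing them, which is presumably where most of the start-up time goes.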
Wish list:

More options to dynamic references for API and saved queries:
- Dynamic references to all variables (not just time)
- Option to pick ALL also from variables which have the ELIMINATION definition
- Option "until xxx" in the same way as "from xxx"
- Both "from" and "until" for the same variable
  o for example, dynamically picking just the municipalities from a geographic variable which includes all the different geographical area levels
- Fixed and dynamic reference to the same variable
  o for example, years 1990, 2000, 2010 and the last three years
- Dynamic references to the first and last values used from the time variable should also be available for the title

Dynamic search queries extended into the contents of files:
  o Option to focus the search only on variable names and/or codes
  o Option to focus the search only on value names and/or codes
  o These options could (and maybe should) also be available for users of saved queries? One saved query generated for each table which fulfils the search query.

More intelligence for data handling regarding graphics:
  o Pivoting of the graphs must be separated from the pivoting of the underlying table.
  o In charts in general, variables with only one selected value should be omitted from pivoting. The information about those values should be included in the title (or subtitle) rather than in the labels.
  o Pivoting of the line chart is hardly necessary.
  o Implementing logic for when certain graph types are available and when not. The application should be able to ban wrong presentation methods and guide the user towards the right ones.
examples:
  o Bar charts shouldn't be available when the user has picked more than one value from more than two different variables, because in bar charts you shouldn't join values from different variables together. In line charts that can be done.
  o The selected values should determine when to use a vertical and when a horizontal bar graph. Vertical bar chart at least as a default when more than one value is selected from the time variable -> time variable on the x-axis. If several values are also selected from some other variable -> time is set as the grouping variable. Horizontal bar chart as a default when no more than one value is selected from the time variable.
  o In bar charts you can have only one grouping variable and one variable which determines the (different) bars within those groups. So when pivoting these bar charts, the user should have only two options: which variable is the outer grouping variable and which is the inner one.
  o Similarly, in stacked bar charts you stack values from one and only one variable into (different) bars determined by the other variable. So stacked bars shouldn't be available either when the user has picked more than one value from more than two different variables.
  o So in stacked bar charts, too, there are only two pivoting options -> which one is the grouping variable and which one is the summed variable (and an ELIMINATION defined value should never be included in the summed values).
  o The time variable should never be the summed variable in stacked bar charts.
  o The pie chart should be available only when the user has selected more than one value from only one variable (not time). An ELIMINATION defined value should never be among those selected values.
  o The pyramid should be available only when the user has selected two values from one variable, and more than one value from only one other variable.
  o The point chart should be available only when the user has selected two values from a CONTVARIABLE, and those two values are placed as the x and y axes of the chart. It doesn't really work quite right at the moment.
  o The radar chart should be available only when the user has selected more than one value from only one variable.
  o If the user has selected more than one value from a CONTVARIABLE, he or she should at least be warned, because charts with more than one unit are highly questionable. The point chart is the sole exception to this rule.
  o In the different charts, the restriction limits should be determined by the values selected from individual variables rather than by the total number of cells in the data (600 at the moment?). Let's say some 10-15 different lines in line graphs, or some 30-40 different bars in horizontal bar charts, and so on.

In addition, these stylistic matters and extra functions would significantly increase the usability of the charts:
  o Y-axis from zero option
  o Sort by values option for pie diagram
  o Don't show zero values in labels in pie diagram
  o Sort by values option for horizontal bar charts
  o Sort by values option (according to either the sum or some other defined value) for horizontal stacked bar charts
  o Sort by time option is hardly necessary
  o Bolded 0-line in line and bar graphs
  o The unit is not usually necessary on the x-axis
  o The edit options display for graphs could be open by default
  o Possibility to edit the unit label
  o Option to have more predefined colour palettes, or even a possibility for end users to create their own colour palettes
  o Better logic for scaling the window frames for the y-axis (for example, no negative axis values should be shown when no negative numbers exist in the data)
  o The grid lines should be positioned automatically according to the graph type (and the option to change them is hardly necessary)
  o In bar charts, axis values can't be omitted randomly, except for the time variable!
  o Option to define the cycle (and starting value) for the value labels shown for time variables (for example, every tenth year from 1900)
  o Remove non-usable selections in the graph view and in the table view in PX-Web. Users get confused when non-valid selections are available on the screen.

Extra functionality to saved charts (and tables):
  o Possibility to extend any preselected combination of variables with any selected combination of values (to be shown as selection windows/pull-downs) next to a saved table or chart (made from fixed selections for the other variables): http://vertinet2.stat.fi/verti/graph/viewpage.aspx?ifile=quicktables/keepalive/tables//monivalintatesti2&isext=true&lang=1&rind=1
  o Possibility to select any number and combination of values from those selection windows/pull-downs. Or rather, this should be an optional choice for each variable: https://vertinet2.stat.fi/verti/graph/viewpage.aspx?ifile=quicktables/kuntien_avainluvut_2016/avainluku_m391&isext=true&lang=1&x=800&y=800&rind=5,6,7
  o Ready-made pull-down for all other relevant conversions (copyable chart, printable chart, html table, Excel table, link to the underlying database table with these selections, different language version etc.) for exactly the same selections as the user sees at that moment as a chart.

Other matters:
- Option to focus the search (of the whole database) only on variable names and/or codes is missing from the search function?
- Bug in the search function (under selection windows) with long lists (over xxx? values)
  o The number of selected values is not shown
  o Additivity of those selections doesn't work
  o These long and often hierarchical variables are just the ones that need this functionality in the first place!
  o The search function would be much more useful if it included basic logical operators such as AND and OR.
- All links should be shown as links. Blue and underlined.
  o Footnotes vs Contact?
  o Marking tips in selection window page
- Option to define our own, better principles for the preselected selection
  o For example, ELIMINATION defined values from all classification variables, the first value from CONTVARIABLE, and the last five values from the time variable, or something like that. That would create a relevant, logical table view, in contrast to picking a few values from each variable.
- Extra confirmation page behind the "Download whole PC-Axis table" link, so that we could also catch these heavy users in our usage monitoring logs.
- Extra option to saved queries: predefined selection on the selection window page. For example, it could be used to support charts made by saved queries. On a web page, you could attach, under the chart, a link to a selection window with the selections that were used to create the chart. That would give the user a more direct path to exploring the data in question further according to his or her needs.
- Variable code to the PX file format. It is already included in API queries?
- Could the user monitoring logs also include IP addresses and session IDs?
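The chart availability rules proposed in the graphics part of the wish list can be written down as a small decision function. The sketch below covers only some of the rules (line, bar, stacked bar, pie, radar and pyramid) and simplifies them: it looks only at how many values the user has selected from each variable and ignores ELIMINATION values and CONTVARIABLE, so the point chart is left out. The function name, the chart type names and the variable names in the usage example are hypothetical, and treating the line chart as always available is an assumption.

```python
def available_chart_types(selected_counts, time_variable):
    """Return the set of chart types allowed for a selection.

    `selected_counts` maps each variable name to the number of
    values the user has selected from it; `time_variable` is the
    name of the time variable.
    """
    # Variables from which more than one value has been selected.
    multi = [name for name, count in selected_counts.items() if count > 1]

    # Assumption: line charts are always allowed, since values from
    # different variables can be joined in them.
    charts = {"line"}

    # Bar and stacked bar: at most two variables with several values.
    if len(multi) <= 2:
        charts.update({"bar", "stacked_bar"})

    # Radar and pie: exactly one variable with several values,
    # and for the pie chart that variable must not be time.
    if len(multi) == 1:
        charts.add("radar")
        if multi[0] != time_variable:
            charts.add("pie")

    # Pyramid: two values from one variable and several values
    # from exactly one other variable.
    if len(multi) == 2 and any(selected_counts[name] == 2 for name in multi):
        charts.add("pyramid")

    return charts
```

For example, `available_chart_types({"Year": 1, "Age": 10, "Sex": 2}, "Year")` allows the pyramid, while selecting several values from three variables leaves only the line chart. The same function could be used to hide the non-valid chart choices from the user interface.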