CYBERINFRASTRUCTURE

This project has focused on how to build an Army of Citizen Engineers and enable them to collaborate in various aspects of the integrated design chain (IDC) by developing the cyberinfrastructure, policies, and social structures that (1) harness human effort, (2) tap collective knowledge, (3) pool communal software, and (4) leverage distributed computational hardware, which we term the four Dimensions of Collaboration (DoC).

To do so, over the past three years the project team has developed five major cyberinfrastructure portals and utilized one commercial cyber-platform to execute six experiments, with a seventh on Mechanical Turk now underway. These portals provide the primary venues for the project's research and educational activities. The portals and the experiments they facilitate allow three features to be explored:

  1. crowd performance given varying expertise and demographics
  2. the dimension of collaboration
  3. the phase of the integrated design chain being targeted

Figure 1.1 summarizes the Dimensions of Collaboration (DoC) and Integrated Design Chain (IDC) concepts. We first outline the five platforms developed by the team, the first three of which were developed in prior project years, and then provide more detailed descriptions of the new platforms and experiments launched over the past year of the project.

Crowdsourcing Component Design
IDC Phase: Component Design
DoC: Collective Knowledge & Crowdsourcing
Open Research Question:  Do ratings correlate with task performance?
Testbed: LRFD/ASD Steel Component Design Queue
Crowd Demographics: 23 structural engineers, Jan-April 2010
Portal description: Portal with task access based on citizen engineer rating and dynamic visualization of aggregated & clustered crowd results to end user
Policy and Governance Features: A dynamic citizen engineer rating scheme, incentivization by tournament mobility
Crowd Sociology Investigated: Correlation between member rating and quality of technical work
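
As a rough illustration of how a dynamic citizen engineer rating with tiered ("tournament mobility") task access could work, the sketch below blends each new task score into a running rating and maps the rating to an access tier; the thresholds, weights, and function names are illustrative assumptions rather than the portal's actual rules.

# Illustrative sketch of a dynamic citizen engineer rating with tiered
# ("tournament mobility") task access. Thresholds and weights are assumed.

TIER_THRESHOLDS = [(0.85, "advanced"), (0.60, "intermediate"), (0.0, "novice")]

def update_rating(current_rating, task_score, weight=0.2):
    """Blend the latest task score (0-1) into the member's running rating."""
    return (1.0 - weight) * current_rating + weight * task_score

def tier_for(rating):
    """Map a rating to the highest task tier the member may access."""
    for threshold, tier in TIER_THRESHOLDS:
        if rating >= threshold:
            return tier

# Example: a member rated 0.55 completes a design task scored 0.90.
rating = update_rating(0.55, 0.90)
print(round(rating, 2), tier_for(rating))   # 0.62 intermediate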

Crowdsourcing Data Collection
IDC Phase: Peer Review
DoC: Collective Knowledge & Crowdsourcing
Open Research Question: Can cellular platforms be used by non-expert crowds to collect technical data (participatory sensing)? And can such crowds be attracted to such tasks with little or no personal benefit?
Testbed: Infrastructure photos portal (Participatory Sensing of Civil Infrastructure)
Crowd Demographics: Aug. 2010, 12 days, 170 photos, 25 users, 30 cities in 6 states
Portal description: Digital devices interfaced via geotagging or Yahoo API address lookup, with data translated to a web portal using the Google API for visualization
Policy and Governance Features: Referral systems using social networks
Crowd Sociology Investigated: Factors motivating voluntary participation and level of attrition among crowd members

Crowdsourcing Data Analysis
IDC Phase: Analysis
DoC: Crowdsourcing
Open Research Question: Can non-expert crowds reliably execute engineering analyses?

Original Testbed: 2010 Haiti Earthquake Photo Classification Portal
Crowd Demographics: Fall 2011, 17 days, 242 undergraduates, random assignment for 9318 classifications of 400 photos
Portal description: web portal supporting paraskilled tasks and multiple gateways, randomized photo assignment, automated tracking of online activities, real-time crowd standings
Policy and Governance Features: classification schema and tutorials, quantification of trustworthiness
Crowd Sociology Investigated: role of moral and utilitarian incentives in predicting quality and quantity of crowd contributions

Original Site: Haiti earthquake photo tagging:  http://citizenengineers.org/study/

Commercial Testbed: Mechanical Turk 2010 Haiti Earthquake Photo Classification Portal
Crowd Demographics: Anonymous Turk workers (over 2000 assignments to these workers)
Portal Description: Mechanical Turk (Amazon Commercial Portal)
Policy and Governance Features: simplified workflow and three layer adaptation of schema, pre-qualification quiz
Crowd Sociology Investigated: demographics and quality of assessments by a diverse, anonymous crowd pre-assembled on a commercial crowdsourcing platform

New Activities: A new experiment on crowdsourcing for urgent human computing was conducted by the team to see how online platforms with pre-assembled crowds perform in crowdsourcing challenges. Amazon’s Mechanical Turk was employed, and the schema and photo data set from the 2010 Haiti Earthquake Photo Classification Portal were streamlined, simplified, and farmed out to Turk workers. This effort differs from most prior uses of the platform in that it outsources technical engineering tasks to “Turkers” of diverse backgrounds. The project design included three major changes: (1) pre-qualification and collection of demographic data, (2) bonus payments for good work, and (3) a layered workflow that simplifies the task relative to the earlier portal. We also investigated mechanisms to enhance worker response to our “HIT,” such as bumping, bonus payments, and altruistic task tagging. The workflow was developed to farm the photo-tagging experiment to Mechanical Turk and then automatically collect and analyze the data in pseudo-real time outside of the commercial platform, using algorithms we developed to evaluate worker quality and crowd consensus from assessments conducted by independent workers on the same photo. This approach is reviewed schematically in Figure 1.2, along with an example of one of the qualification HITs. The process was rolled out in three layers in the summer of 2012, and results are now being analyzed to see how Turk worker performance compares with that of the controlled, recruited crowd in the prior experiment.

Figure 1.2: Schematic of Mechanical Turk Experiment Toolset (top) and screen capture of one pre-qualification test.
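
As a rough sketch of the kind of consensus and worker-quality bookkeeping described above, the example below takes redundant classifications of the same photo by independent workers, forms a majority-vote consensus, and scores each worker by agreement with that consensus; the data and names are hypothetical, and the algorithms actually developed by the team may differ.

from collections import Counter, defaultdict

# Hypothetical (worker_id, photo_id, label) triples from redundant assignments.
classifications = [
    ("w1", "p1", "collapse"), ("w2", "p1", "collapse"), ("w3", "p1", "minor"),
    ("w1", "p2", "minor"),    ("w2", "p2", "minor"),    ("w3", "p2", "minor"),
]

def consensus_labels(records):
    """Majority vote per photo across independent workers."""
    votes = defaultdict(Counter)
    for worker, photo, label in records:
        votes[photo][label] += 1
    return {photo: counts.most_common(1)[0][0] for photo, counts in votes.items()}

def worker_quality(records, consensus):
    """Fraction of each worker's labels that match the crowd consensus."""
    hits, totals = Counter(), Counter()
    for worker, photo, label in records:
        totals[worker] += 1
        hits[worker] += int(label == consensus[photo])
    return {worker: hits[worker] / totals[worker] for worker in totals}

consensus = consensus_labels(classifications)
print(consensus)                                    # {'p1': 'collapse', 'p2': 'minor'}
print(worker_quality(classifications, consensus))   # w1, w2: 1.0; w3: 0.5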

Data from the original experiment using the in-house platform also continues to be mined and analyzed. New activities in this area included ground-truth experiments with a panel of experts, intended to support automated assessment of the quality of crowd contributions, as well as studies of crowd sociology and of whether specific informational feedback influenced crowd commitment and the quality of contributions.
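
For the ground-truth comparison, one simple possibility is a chance-corrected agreement measure between crowd consensus labels and the expert panel's labels; the sketch below uses Cohen's kappa with made-up photo IDs and damage classes, and is not the project's actual quality metric.

from collections import Counter

# Hypothetical ground-truth check: crowd consensus labels vs. an expert panel.
# Cohen's kappa is one possible chance-corrected agreement measure; the photo
# IDs and damage classes below are illustrative only.

crowd  = {"p1": "collapse", "p2": "minor",    "p3": "moderate", "p4": "minor"}
expert = {"p1": "collapse", "p2": "moderate", "p3": "moderate", "p4": "minor"}

def cohens_kappa(a, b):
    """Chance-corrected agreement between two labelings of the same photos."""
    photos = sorted(set(a) & set(b))
    n = len(photos)
    observed = sum(a[p] == b[p] for p in photos) / n
    ca = Counter(a[p] for p in photos)
    cb = Counter(b[p] for p in photos)
    expected = sum(ca[lab] * cb[lab] for lab in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1.0 - expected)

print(round(cohens_kappa(crowd, expert), 3))   # 0.636 for this toy example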

Crowdsourcing Engineering Conceptual Designs
IDC Phase: System Concepts
DoC: Crowdsourcing
Open Research Question: How can crowd creative contributions be automatically evaluated and aggregated? How can innovation in particular be measured, and does that level of innovation change in team environments?
Testbed: Shelters for All Competition (www.sheltersforall.org)
Crowd Demographics: Dec 2011 to Jan 2012, 10,000 site visitors, 705 registered users from 70 countries, 100+ submissions received (25% by teams)
Portal description: Submission interface, design gallery, evaluation tools, automated ranking
Policy and Governance Features: multi-dimensional assessment rubric, self-assessment
Crowd Sociology Investigated: reliability of crowd assessments, individual vs. team performance

Detailed Description: Crowd submissions of conceptual designs can be exceptionally challenging to evaluate and thus pose a common bottleneck for effective crowdsourcing and for the automation of such activities. This experiment therefore attempted to deliver something that most conceptual crowdsourcing efforts lack and that would be critical to the IDC: automated assessment of crowd submissions by the portal. To do so, a highly creative design challenge had to be devised that could attract wide participation from a diverse crowd. The resulting challenge asked citizens at large to propose housing designs for the developing world that could be scored along several dimensions assessing the safety, feasibility, and sustainability of the concept. This was the most ambitious experiment undertaken, since it was open to the public around the world. Activities therefore focused on a number of tasks leading up to the launch to collect the data necessary for both the engineering and social science research. These included:

  • Developing the website portal  (FAQ, terms of use, registration system)
  • Developing entrance and exit survey instruments
  • Designing and implementing a marketing effort involving social media, blogs and cooperative efforts from a number of professional organizations and groups
  • Negotiating Terms of Use and IP
  • Developing comprehensive project brief
  • Developing the expert, self and peer evaluation systems
  • Recruiting and training expert evaluators
  • Selecting competition winners
  • Collecting and arranging all the data collected through the competition (registration information, entry and exit survey, submission data, peer evaluations)

The platform was launched in the late fall of 2011, with promotion by our team through social media and the professional networks of likely crowd communities. The portal landing page is shown in Figure 1.3. Crowd members were required to complete entrance and exit surveys and to conduct peer assessments using the established rubric. The screen shot in Figure 1.3 also shows the progress stages each of the 100 fully compliant submissions had to complete.

Figure 1.3: SheltersforAll.org landing page (Top) and user account progress bar showing stages required for all eligible crowd members.

Submissions were also evaluated by a trio of experts for the ground-truth assessment: an engineer, an architect, and a cultural expert from the country in question. Their ratings allowed the winner of the competition to be identified and, more importantly, allowed the correlation with peer and self-assessments to be explored across the various dimensions embodied in the rubric. Various scoring schemes and innovation measures were also developed to rate the submissions, all suitable for future embedding in a crowdsourcing platform. The portal closed in the spring of 2012, and a design gallery is currently in development to house the more than 100 submissions in a searchable database.
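
One simple way to explore such correlations is to compute, for each rubric dimension, the Pearson correlation between expert-panel scores and peer scores across submissions; the sketch below does so with invented dimension names and scores, and is offered only as an illustration of the analysis rather than the scoring scheme actually used.

import math

# Illustrative check of how peer scores track expert-panel scores on each
# rubric dimension; dimension names and scores are invented for the example.

expert_scores = {"safety":      [4.0, 3.0, 5.0, 2.0],
                 "feasibility": [3.5, 4.0, 4.5, 2.5]}
peer_scores   = {"safety":      [3.8, 3.2, 4.6, 2.4],
                 "feasibility": [3.0, 4.2, 4.0, 3.0]}

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

for dim in expert_scores:
    print(dim, round(pearson(expert_scores[dim], peer_scores[dim]), 2))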

Portal: http://sheltersforall.org/ 

Crowdsourcing Expert Tasks
IDC Phase: Analysis
DoC: Shared Software & Hardware
Open Research Question: Can advanced simulation tools running on cloud resources be operated by crowds?
Testbed: Computational Fluid Dynamics (CFD) on the cloud
Crowd Demographics: Two experiments in Nov 2011 and April 2012 involving 20 skilled engineers
Portal description: workbench with back end cloud resources and front end GUI, employing open source software (OpenFOAM)
Policy and Governance Features: optimized template for advanced settings, 3-step workflow, tutorial
Crowd Sociology Investigated: crowd expectations for ease of use, wait time

Detailed Description: Because crowdsourcing in support of civil engineering must accommodate tasks of widely ranging complexity, portals must also be developed for expert citizens, whose needs differ from those of amateur citizen engineers. For the past two years, a portal has been developed under Task 2 of the project focused on engaging expert citizens in a sophisticated analysis task. For users with such complex requirements, portals must support a pre-certification process (a knowledge quiz), detailed work specifications, and supporting documentation. Expert Citizen Engineers commonly require access to advanced simulation tools whose computational demands exceed the capabilities of their personal workstations. For this reason, a back-end simulation platform was developed that takes the parameters submitted by users, generates a computational model, and executes the requisite simulations. Moreover, the complex, domain-specific analysis tools often necessary for these analyses have a very steep learning curve and would be inaccessible to most Citizen Engineers. Such an example is afforded in this case study by a Computational Fluid Dynamics (CFD) analysis of structures under the action of turbulent winds. While CFD provides a powerful simulation tool, it is generally not easy to employ, even for trained engineers. This portal, and the two experiments that have been run on it, attempts to demonstrate the capability of well-engineered cyberinfrastructure and web interfacing, linked to distributed computational resources, to reduce the technological barriers that could isolate Citizen Engineers from the resources necessary for their work. The platform permitted these users to simulate turbulent flow for multiple scenarios to explore the influence of grid density, with the flexibility to specify mesh parameters and simulation time steps.
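
To make the back-end idea concrete, the sketch below shows how user-supplied time-step and run-length parameters could be rendered into an OpenFOAM controlDict for one scenario; the template is abridged, and the parameter names and paths are simplifying assumptions rather than the portal's actual configuration (a real case also requires mesh, boundary-condition, and turbulence dictionaries).

from pathlib import Path

# Simplified sketch: render user-supplied simulation parameters into an
# OpenFOAM controlDict. The template is abridged; a real case also needs
# mesh, boundary-condition, and turbulence dictionaries.

CONTROL_DICT = """\
FoamFile {{ version 2.0; format ascii; class dictionary; object controlDict; }}
application     pisoFoam;
startTime       0;
endTime         {end_time};
deltaT          {delta_t};
writeInterval   {write_interval};
"""

def write_control_dict(case_dir, delta_t, end_time, write_interval=100):
    """Write the controlDict for one user-submitted scenario."""
    system_dir = Path(case_dir) / "system"
    system_dir.mkdir(parents=True, exist_ok=True)
    (system_dir / "controlDict").write_text(
        CONTROL_DICT.format(end_time=end_time, delta_t=delta_t,
                            write_interval=write_interval))

# Hypothetical case directory and parameters for one crowd member's run.
write_control_dict("cases/user42_run1", delta_t=0.002, end_time=10.0)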

The platform has three elements: the underlying open-source CFD package (OpenFOAM); a front-end user interface that allows users to input a variety of parameters and visualize results; and a distribution system that serves as the dispatch controller, sending simulation jobs to distributed hardware resources on the cloud (see Figure 1.4). The design of the portal included a number of features to make the technology accessible to crowds: standardized templates, default settings, user-friendly and highly visual modules, a well-organized workflow divided into a logical series of steps, and the ability to visualize results, download data, and monitor progress. A sampling of the web-portal interfaces facilitating these features is shown in Figure 1.5.
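
A minimal sketch of the dispatch-controller idea follows: user-submitted cases are queued and handed to cloud worker nodes. The host names, paths, and solver command are placeholders rather than the production configuration shown in Figure 1.4.

import queue
import subprocess

# Minimal sketch of the dispatch-controller idea: user-submitted cases are
# queued and handed to cloud worker nodes over SSH. Host names, paths, and
# the solver command are placeholders, not the production configuration.

job_queue = queue.Queue()
worker_hosts = ["cloud-node-01", "cloud-node-02"]   # hypothetical node names

def submit(case_dir):
    """Called by the web front end when a user launches a simulation."""
    job_queue.put(case_dir)

def dispatch_once(host):
    """Pop one queued case and run the solver remotely; return exit status."""
    case_dir = job_queue.get()
    cmd = ["ssh", host, f"cd {case_dir} && pisoFoam > log.pisoFoam 2>&1"]
    return subprocess.run(cmd).returncode

submit("cases/user42_run1")
# status = dispatch_once(worker_hosts[0])   # requires real SSH access to run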

This portal also explored the challenges associated with aggregating submissions from highly complex crowdsourced activities. It provided submission assessment criteria covering experimental setup, data generation, and output representation, and flanked these with a simulation quality indicator that compares a user's pre-qualification results with values published in the literature, providing a mechanism for objective evaluation even where the interpretation and discussion of results cannot be evaluated objectively. Deviation in this litmus test then reduces the importance ranking applied to submissions from that Citizen Engineer. The portal was deployed in two graduate-level courses in Civil Engineering at Notre Dame in 2011-2012.
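
As an illustration of such a quality indicator, the sketch below compares a user's pre-qualification result (for example, a drag coefficient) with a published benchmark value and converts the relative deviation into a weight that can down-rank poor submissions; the benchmark, tolerance, and weighting rule are assumptions for illustration only.

# Illustrative quality indicator: compare a user's pre-qualification result
# (e.g., a drag coefficient) with a published benchmark value and convert the
# relative deviation into a weight used to down-rank poor submissions.
# The benchmark value, tolerance, and weighting rule are assumptions.

def quality_weight(user_value, benchmark, tolerance=0.10):
    """Return 1.0 inside the tolerance band, decaying linearly to 0 by 2x it."""
    deviation = abs(user_value - benchmark) / abs(benchmark)
    if deviation <= tolerance:
        return 1.0
    return max(0.0, 1.0 - (deviation - tolerance) / tolerance)

print(quality_weight(user_value=2.05, benchmark=2.0))   # within 10%: 1.0
print(quality_weight(user_value=2.35, benchmark=2.0))   # 17.5% off: 0.25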

Currently the portal is being used for a three-level, non-expert crowd experiment on Amazon Mechanical Turk that will conclude in August.

Figure 1.4: Schematic of the back-end dispatch controller

Figure 1.5: Examples of screen shots from the OpenFOAM platform, allowing specification of simulation parameters and visualization of results.

Website: http://workspace.crc.nd.edu/