I am an open-source, full-stack developer with many years of professional experience, working primarily with Java and web technologies, and specializing in natural language processing (NLP) and speech technology.
Born in New Zealand, I attained a BA in Philosophy at the University of Auckland and an Advanced Certificate in Business Computing at Auckland Institute of Technology (both in 1996), a BSc in Linguistics at the University of Canterbury (in 2003), and a BA(Hons) in Linguistics at the University of Canterbury (in 2013).
My studies gave me, among other things, a good theoretical grounding in logic and computer science, the practicalities of software development in a variety of paradigms and languages, and a broad understanding of linguistics, including sociolinguistics, syntax, phonology, and speech processing.
I have more than two decades of professional experience working in various roles and using a variety of technologies. Starting as a C/C++ programmer in the telephony and internet connectivity industries, I've also had roles managing the development and architecture of 'traditional' RDBMS-based client/server systems using VB6, C#, and SQL Server, as well as web development with MySQL, PHP, JSP, and Node. This culminated in the sole-charge development of LaBB-CAT, a Java-based open source linguistics research tool for the University of Canterbury, which has also been adopted in a number of other academic and industry settings internationally.
Having many years of experience in software design, implementation, and testing, relational database design, data and infrastructure migration, front-end user-experience design, web-based APIs and apps, and cross-platform mobile app development and deployment, I have often been entrusted with end-to-end management of software projects.
Working in both academic and commercial environments has required me to develop a range of communication skills, allowing me to interact effectively, both face-to-face and in writing, with principal investigators and founders, acting as a bridge between research needs and technological possibilities, and to develop and present training materials for research assistants and operators. My experience with clinical speech elicitation apps has also required a deep empathy for one-time study participants who have little prior knowledge of what's required of them. The goal is a user experience that is self-explanatory and maximally usable by participants who may have only superficial experience with mobile and web-based technology, and who may have impairments that might otherwise make participation difficult.
I have also published a number of academic articles, and presented at conferences, in the fields of Corpus Linguistics and Speech/Language technology.
I've lived in Buenos Aires, Argentina since 2004, where I have been working remotely for years.
Summary of Skills:
years experience, including the following technologies:
- Java Server Pages (JSP)
- JSP Standard Tag Library (JSTL) and Expression Language (EL)
- custom tag libraries
- WebStart and standalone applications
- Java Database Connectivity (JDBC)
- Java Media Framework (JMF)
- Play Framework
- ANTLR for domain-specific-language grammar implementation
- Web Technologies
years experience with client- and server-side coding of websites and web applications, including the following technologies:
- CSS3, including responsive web design, flex, and keyframes
- JQuery, including JQueryUI and JQueryMobile
- Appcelerator Titanium
- Apache Cordova
- NodeJS with Express/Handlebars
years experience (including development of stored procedures, user-defined functions, triggers, etc.), using the following DBMSs:
- MS SQL Server 7/2000/2005/2008
- MS Access
- Some professional exposure integrating with R programs and authoring an R package.
- Some professional experience using Python-based NLP tools and integrating Jython with Java-based systems.
- 15 years of object-oriented programming under Windows XP/NT/2000, Vista, 7, etc. (including development of web services and SQL Server data access)
- 3 years experience under MS-DOS, Unixware (2.1), Solaris, Windows 95/98/ME/NT/2000
- Objective C
- Some experience developing iOS plugins for Apache Cordova applications.
- Visual Basic
- 15 years experience, including Object Linking & Embedding (OLE), Dynamic Data Exchange (DDE), database access (using MS Access and MS SQL Server), calling the Windows API, and integration with .NET applications, using VB6
- Some professional experience using this server-side scripting language to implement various web-based applications, connecting to MySQL, PostgreSQL and ODBC data sources.
- 2 years experience, under MS-DOS, Unixware (1.1), AIX, Linux
- Some experience writing extensions for Emacs.
- Microsoft Windows
- 3.1, 3.11, 95, 98, ME, NT, 2000, XP, Vista, 7, 10
- Macintosh OS X
- 10.8, 10.9, 10.12
- Ubuntu Linux (12.04 - 18.04), Amazon Linux: plenty of experience writing shell scripts and using the file system and standard utilities (grep, sed, vi, etc.)
Advanced Certificate in Business Computing
Auckland Institute of Technology, 1996
Bachelor of Arts
University of Auckland, 1996
Bachelor of Science
University of Canterbury, 2003
Bachelor of Arts with First Class Honours
University of Canterbury, 2014
New Zealand Institute of Language, Brain and Behaviour (NZILBB), University of Canterbury
- Professor Jennifer Hay, +64 3 364 2987 ext 6242 (email@example.com)
- Duties include
Design, development and maintenance of LaBB-CAT, a multimodal language annotation store and corpus management system, for storage and manipulation of speech and language data to facilitate a wide variety of research projects. The system integrates with 3rd-party tools commonly used by researchers in this field, facilitating corpus-building using existing and new data, and extraction for processing, analysis, and visualization. It also facilitates automatic annotation of large amounts of data en masse, reducing drudgery and increasing the rapidity with which research outcomes can be reached.
The system enables language research at a speed, scale, and accessibility that was previously impossible in this domain, and has been adopted in a number of research institutions including:
- the NZILBB at the University of Canterbury (Jennifer Hay, Margaret Maclagan, Jeanette King, Kevin Watson, Lynn Clark, Megan McAuliffe),
- the Glasgow University Laboratory of Phonetics (Jane Stuart-Smith),
- Arizona State University (Visar Berisha, Julie Liss),
- ZAS Berlin (Stefanie Jannedy),
- University of Hawaiʻi at Mānoa (Katie Drager),
- Adam Mickiewicz University in Poznań (Kamil Kaźmierski),
- Medical University of South Carolina (Boyd Davis, Charlene Pope),
- University of Oxford (Sarah Ogilvie),
- Griffith University (Gerry Docherty), and
- Australian National University (Ksenia Gnevsheva).
This work includes:
- media handling using Java APIs and HTML5 elements,
- automatic annotation modules, including integration of existing lexical databases (CELEX, Unisyn, CMU Dict) to provide advanced searching capabilities,
- machine learning modules, such as using HMM Toolkit (HTK) to train voice recognition models for time-alignment,
- security integration,
- server-side and client-side integration with 3rd party tools, using web-based APIs, Java applets and browser extensions,
- management of conversion of audio and other data between different formats,
- presentations and training courses/materials,
- documenting the project in online help, journal articles and conference presentations.
Design, development and maintenance of open-source mobile apps for eliciting speech samples for direct upload to LaBB-CAT, for Android, iOS, and browsers. The app presents participants with a series of screens that ask questions, present texts to read, or show textual/visual stimuli for eliciting spontaneous speech. It then records their answers and speech, and uploads the resulting data to LaBB-CAT.
The app has been used in a number of research projects, including the development of a corpus of dysarthric speech, a migraine-diary study, and gathering speech and attitude data from the trans community.
- Production and maintenance of Excel spreadsheets, editor plugins, and other tools to aid academic staff and research assistants.
- User support for academic staff, collaborators, post-docs, and research assistants.
- Server administration and maintenance, including software installation and upgrade, data and web-app migration, administration of GitLab repositories, and facilitation of the provisioning of computing and storage resources.
- Doctor Visar Berisha, firstname.lastname@example.org
- Duties include
- Development, maintenance, and deployment of speech corpus back end, including integration with proprietary annotation software.
- Design, development, and maintenance of speech data collection and annotation mobile apps, including 2017 Scrip Award winning app 'SpeechAssess'.
- Microservices infrastructure design and development using Amazon Web Services (AWS) technologies.
PatronBase (formerly Solution Architects)
- John Caldwell, +64 21 663 731 (email@example.com)
- Duties included
- Maintenance of current software modules (in VB6 and C#, on Windows systems, including the ticketing system PatronBase), from UX through to backend database design, including bug-fixing and feature enhancements, as well as porting from MS Access to MS SQL Server data access and maintaining reporting tools using Crystal Reports
- Production of install sets (using InstallShield Express and MS Developer Studio)
- Specification, design, implementation, and testing of new tools, software modules, products, and web applications, using VB6, C#, Java, Cold Fusion, MS Access, MS SQL Server, MySQL
- Nick Egerton, +62 9 308 2500 (firstname.lastname@example.org)
- Duties included
- Development of a good understanding of common internet protocols, from the fundamentals of IP packet structure, to higher-level protocols such as FTP, PPTP, HTTP, etc.
- Maintenance of the WinGate engine (Internet proxy server), in C++
- Maintenance of WinGate's GUI configuration and monitoring tool, GateKeeper, in C++ using MFC
- Maintenance of the WinGate Internet Client, a Layered Service Provider DLL for internet connectivity through WinGate using Qbik's Winsock Redirector Protocol (WRP), in C++
- Maintenance of the WinGate WWW Proxy Authentication Client in Java
- Maintenance of WinGate's Network Address Translator (NAT), in C, using NuMega's SoftIce for network driver debugging, and Microsoft's Network Monitor for analysis of IP packets
Boulevard Web Systems
- Warwick Schaffer, +64 3 365 6480 (email@example.com)
- Duties included
- Design, development and maintenance of an internet-based Application Service Provider platform (netStep) and various application modules (including an internet-café management system and a point-of-sale/inventory system). This includes Java Servlets, JSP, and PHP web application development, LDAP directory design, maintaining Delphi client applications, and prototyping clients using Java technology. An XML middle-ware component and a SOAP server have also been developed for client/server communication.
- Development of internet applications internally for Boulevard Web Systems, including development of a Java applet/application for planning and project management (using Swing components and JDBC to connect to a MySQL database)
- Del Robinson, +64 3 385 3856 (Del@OmegaTech.co.nz)
- Duties included
- Assistance with development of a Java applet for graphically presenting data provided by a CGI application
- Robert Fromont (2019) Forced alignment of different language varieties using LaBB-CAT, in Proceedings of the 19th International Congress of Phonetic Sciences, pages 1327 – 1331
- Robert Fromont (2017) Toward a format-neutral annotation store, Computer Speech & Language. DOI: http://dx.doi.org/10.1016/j.csl.2017.01.004, Available online 9 February 2017.
- Robert Fromont & Kevin Watson (2017) Factors influencing automatic segmental alignment of sociophonetic corpora, Corpora. Volume 11 Issue 3, Pages 401 – 431 DOI: http://dx.doi.org/10.3366/cor.2016.0101, ISSN 1749-5032, Available Online January 2017.
- Robert Fromont & Jennifer Hay (2012) LaBB-CAT: an Annotation Store, in Proceedings of Australasian Language Technology Association Workshop, pages 113 – 117
- Robert Fromont & Jennifer Hay (2008) ONZE Miner: the development of a browser-based research tool, Corpora. Volume 3, Pages 173 – 193 DOI 10.3366/E1749503208000142, ISSN 1749-5032, Available Online Nov 2008.
Public Code Repositories
- LaBB-CAT - a corpus annotation store
- The nzilbb.labbcat R package published on CRAN.
- ElicitSpeech (Titanium) and ElicitSpeechWeb (Cordova) for speech elicitation
- nzilbb.ag API for annotation graph conversion and processing
- BAS Service integration for Java
- JSendPraat and WebSendPraat for browser integration with the Praat speech tool.
- Hexagon, a flexible, modular content management system.
- Oxygen plugins for standardized TEI transcripts of written texts
- Artificial Intelligence - I think my interest in AI is what initially drew me into Computing, and also into Philosophy, and is part of the reason I've pursued machine learning and NLP. My computer is, as yet, not self aware, but I continue to study and ponder the subject.
- Argentine Tango - this intimate dance is the original reason for moving to Argentina. I've been dancing tango since January 2001, and have since found never-ending challenges and delights in its intricacies and subtleties.
- Spanish - stemming initially from my interest in tango, and ultimately from day-to-day necessity, learning to speak Spanish has been at once challenging and fascinating. Reading my favourite author, Jorge Luis Borges, in his native Spanish continues to be an illuminating experience. Some informal and professional translation work keeps me extending my vocabulary and grammar.