Mid-point project progress blog post

 – What was your original internship project timeline?

I’m happy to share my original internship project timeline here:

Stage 1:

[1] May 30 – June 12: Optimize the code from contribution task 3 to accurately match an author’s first, middle, and last names in English and English-like languages, and automatically add the author names to Wikidata.

[2] June 13 – June 26: Optimize the code from contribution task 3 to handle authors with multiple given, middle, or family names in English and English-like languages, and add all of those names to Wikidata.

[3] June 27 – July 10: Optimize the code from contribution task 3 to match multilingual given/family names in German, Japanese, Chinese, etc., and add each name in its corresponding language on Wikidata.

Stage 2:

[4] July 11 – July 24: Support as many scientific databases as possible as sources for matching author names, and automatically detect links to BibTeX or other scientific databases on a Wikidata item.

[5] July 25 – August 7: Automatically fix author name errors on Wikidata using scientific databases as the source, for example by using NLP to correct mistyped or misspelled author names on Wikidata.

[6] August 8 – August 21: Automatically create a given/family name item on Wikidata, using scientific databases as the source, if the name does not yet exist on Wikidata.

Stage 3:

[7] August 22 – August 24: Discuss the work with my mentors, gather their suggestions, and submit large runs to Wikidata to finalize the work.

[8] August 25 – August 26: Write a summary of the project on author names on Wikidata.

Note:
1. Each new piece of work should be tested in a sandbox/testing environment, then pushed/submitted to the production environment on Wikidata.
2. After stage 1, submit a bot request for bulk runs, then start working on stage 2 while waiting for the bot to be approved.
 – What goals have you met?

I think I’ve met goals [1] and [2] from my original timeline.

I solved [1] and [2] by identifying only the author’s surname; the rest of the name can then be treated as the given name property on Wikidata.
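
To make the surname-first idea concrete, here is a minimal sketch in Python. It is not the code from my bot: the function name `split_author_name` and the two input shapes it accepts ("Surname, Given Names" and "Given Names Surname") are assumptions made for illustration only.

```python
from typing import NamedTuple


class SplitName(NamedTuple):
    family: str  # candidate value for the family name property
    given: str   # everything else, treated as the given name(s)


def split_author_name(raw: str) -> SplitName:
    """Pick out the surname and treat the rest as the given name part.

    Assumption: the input is either "Surname, Given Names" or
    "Given Names Surname". Other conventions are not handled here.
    """
    raw = " ".join(raw.split())  # collapse repeated whitespace
    if "," in raw:
        # "Surname, Given Names": the surname is everything before the first comma.
        family, _, given = raw.partition(",")
        return SplitName(family.strip(), given.strip())
    # "Given Names Surname": the surname is the last whitespace-separated token.
    parts = raw.rsplit(" ", 1)
    if len(parts) == 1:
        return SplitName(parts[0], "")
    return SplitName(parts[1], parts[0])
```

For example, `split_author_name("Doe, Jane Q.")` yields the surname `Doe` and the remainder `Jane Q.`. Compound surnames written without a comma (such as "van der Berg") would be split incorrectly by this sketch, which is one reason the real strategy needs more care.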

You can see my contribution on these tasks here: https://www.wikidata.org/wiki/User:Feliciss/StrategyOfNames
 – What have you accomplished in the first half of your internship?

Besides meeting the author-name goals in the first half of my internship, I also matched several statements from an article that may be missing from the existing paper items on Wikidata, such as page(s), publication date, and published in.
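
As a rough illustration of this kind of matching, the sketch below checks which of those three statements an item is still missing. The property IDs are the real Wikidata ones (P304 page(s), P577 publication date, P1433 published in), but the function `missing_statements`, the shape of `source_record`, and the set of existing property IDs are hypothetical and do not reflect the ADS API or my actual bot code.

```python
# Wikidata property IDs for the statements mentioned above.
PROPERTY_IDS = {
    "page(s)": "P304",
    "publication date": "P577",
    "published in": "P1433",
}


def missing_statements(source_record: dict, existing_claims: set) -> dict:
    """Return {property_id: value} for values the source has but the item lacks.

    source_record: hypothetical dict keyed by the labels in PROPERTY_IDS.
    existing_claims: property IDs already present on the Wikidata item.
    """
    missing = {}
    for label, pid in PROPERTY_IDS.items():
        value = source_record.get(label)
        # Skip empty source values and properties the item already has.
        if value and pid not in existing_claims:
            missing[pid] = value
    return missing
```

Only the properties returned by this check would be candidates for new statements; anything already on the item is left alone.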
 – What project goals took longer than expected?

In my view, project goal [3] took longer than I expected.
 – Why did those project goals take longer than expected?

Because only a small portion of the names in the ADS database are in Chinese or Japanese.
 – What would you do differently if you were starting the project over?

I would focus only on names in different languages, drawn from different name databases.
 – Which original goals needed to be modified?

Goals [4] and [5] needed to be modified because there isn’t enough time to complete them during the internship.

Goal [4] was narrowed to focus only on the ADS database, and goal [5] is to be removed from my original goals.
 – What is your new plan for the second half of the internship?

My new plan for the second half of the internship would be:

1. Complete the task of adding statements from ADS to existing articles on Wikidata. Work towards decent coverage of the articles currently on Wikidata.
2. Submit a bot request to the community.
3. Work on adding properties related to the statements. The bot will be started after this task is finished.
4. (Optional) If time allows, research and work on author names in different cultures, such as Chinese and Japanese. Then repeat tasks 1-3 listed above.
