Power BI

New

Data Connector for MS Word Documents (.docx)

André Luiz on 17 Apr 2018 09:10:48

Create a connector for MS Word documents (.docx) that lets the user choose what to connect to: the text contents, the table contents, and so on.

Many companies keep standard reports in .docx, and it would be nice to import them into Power BI/Power Query to analyse the data, for example as word clouds.
Besides that, a .docx file is a ZIP archive of XML files, so this should not be difficult to implement.
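
To illustrate the point about .docx being zipped XML, here is a minimal Python sketch that pulls paragraph text out of a document using only the standard library. It is not a Power BI connector and not how Microsoft would implement one. The function name `docx_paragraphs` is made up for this example, and it assumes the main body lives in `word/document.xml`, which is the usual location in files saved by Word.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each non-empty paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: assumes the body is stored in word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)

    paragraphs = []
    # <w:p> is a paragraph; its visible text sits in <w:t> runs.
    # Paragraphs inside table cells are <w:p> too, so they are included.
    for p in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Calling `docx_paragraphs("report.docx")` (the file name is a placeholder) gives a list of strings that could then feed a word cloud. Headers, footers, footnotes and tracked changes live in other parts of the archive and are ignored by this sketch.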

Comments (5)
c20ca514 ba45-4c0e-866e-833c70810960 on 06 Jul 2020 00:05:10

RE: Data Connector for MS Word Documents (.docx)

This could be very helpful, definitely something I would use.

34551a20 9380-48ad-bdf6-b8acce3476e3 on 06 Jul 2020 00:02:09

I think if you use R, there are tools that can extract all of the words into a vector. Look at the readtext package (library(readtext)) or read_docx from qdapTools: https://www.rdocumentation.org/packages/qdapTools/versions/1.3.3

4e775507 c6e0-496c-9eae-ac16c222c39b on 06 Jul 2020 00:00:02

In the Quality area, many documents written in .docx contain tables, filled-in forms and completed questionnaires. It would be very useful to be able to extract all of this data easily for comparisons, statistics and monitoring.
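
For the table case described above, a rough Python sketch along the same lines could read each table into a list of rows. This is only an illustration under assumptions: `docx_tables` is a hypothetical helper, it reads `word/document.xml` directly, and it only looks at rows and cells that are direct children of their table or row. Rows wrapped in content controls are skipped, and nested tables show up both on their own and as text inside their parent cell.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace prefix in ElementTree's {uri}tag form
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_tables(source):
    """Return every table as a list of rows, each row a list of cell strings.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):          # <w:tbl> = table
        rows = []
        for tr in tbl.findall(W + "tr"):      # <w:tr> = row (direct children only)
            cells = []
            for tc in tr.findall(W + "tc"):   # <w:tc> = cell (direct children only)
                # A cell can hold several paragraphs; join them with newlines.
                paragraphs = (
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.iter(W + "p")
                )
                cells.append("\n".join(paragraphs))
            rows.append(cells)
        tables.append(rows)
    return tables
```

Each inner list of rows can be passed to a tabular tool of your choice. Merged cells are not expanded, so rows in the same table may have different lengths.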

8293a8bc b2d6-4bc3-b77b-4ea2f2b138d9 on 05 Jul 2020 23:56:04

This would be very helpful, especially given that Word documents often contain a lot of valuable information, such as a glossary of terms or other text-rich data.