The Importance of Uniform Standards (testing link reference format on xlog)
On April 10, 2024, from 14:00 to 16:05, I spent 2 hours tinkering and finally managed to run PyHanLP successfully! Now, let me record the knowledge I have learned👇
In order to complete a part of my graduation project - extracting unstructured information from text - I needed a tool that could understand semantics. So today, I installed PyHanLP for my Python.
Now, let me share the problems I encountered. If I have time, I will introduce what HanLP is. If not, then forget it.
In this article, I will show you through my own experience that not understanding one concept can make your work much more complicated! Had I valued this concept and understood its impact, I think I would have chosen to learn Conda, a tool that manages Python packages and virtual environments, and then installed the HanLP third-party library through it.
Precautions for Foolproof Installation of HanLP#
I used the official website's foolproof installation package. Little did I know that this foolproof installation package treated me like a fool. It didn't even let me choose the installation directory: with a single click it installed a Java environment, HanLP, and a new Python 3.8!
And all these things were installed on the C drive!
My original Python 3.12 couldn't call PyHanLP. This PyHanLP can only be called by the Python 3.8 installed by the foolproof installation.
To solve this problem, I tried to replace the Python 3.12 in my project's virtual environment with Python 3.8. However, after the replacement, the code I had written no longer worked - the reason being that the libraries it depends on were not installed for Python 3.8.
It was then that I realized that a virtual environment is tied to one specific Python version. Even if I switch the interpreter to Python 3.8, it cannot use the libraries that pip installed for Python 3.12 in that environment, because those libraries live in the Python 3.12 package folder and were installed for that version only! So I had to create a new virtual environment.
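To see which Python version and which package folder an interpreter is actually using, a short check like the one below can help. This is only a sketch using the standard library; the function name describe_interpreter is made up for this example.

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect basic facts about the interpreter that is running this script."""
    return {
        # Major.minor version, e.g. "3.8" or "3.12"
        "version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Full path of the python executable in use
        "executable": sys.executable,
        # Folder where pip puts pure-Python packages for this interpreter
        "site_packages": sysconfig.get_paths()["purelib"],
        # Inside a venv, sys.prefix differs from sys.base_prefix
        "in_virtual_env": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this once with each interpreter should show a different package folder for each one, which is why libraries installed for one Python are not visible to another.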
So I created a new virtual environment based on Python 3.8 and copied all the previous code over. Then I re-installed the third-party libraries the code needs. However, another problem appeared in front of me.
I tried to run PyHanLP and found that the terminal couldn't find this library. I realized that PyHanLP had been installed into the global Python 3.8 environment, and a virtual environment cannot see libraries installed there.
The catch is that this command did not work for me inside the virtual environment:
pip install pyhanlp
That is exactly why I chose the foolproof installation package to install PyHanLP in the first place. In other words, I can't use PyHanLP in the virtual environment. I can only use the Python 3.8 on the C drive and install the libraries on the C drive!
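A quick way to tell whether the current interpreter can find a library at all is to ask the import system directly. The sketch below uses only the standard library; the helper name is_installed is my own, and pyhanlp is simply the module name being checked.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can locate the module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "pyhanlp"
    status = "available" if is_installed(name) else "not found"
    print(f"{name} is {status} for {sys.executable}")
```

Running it with the global Python and again with the virtual environment's Python shows which of the two can actually see the library.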
To solve this problem, I had to give up using the virtual environment and use the global environment directly. I needed to add Python 3.8 to the computer's PATH environment variable, and its entry must come before the one for Python 3.12. Only then will this command, typed in the terminal, target Python 3.8:
pip install
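When several Pythons are installed, one way to avoid guessing which one a bare pip belongs to is to run pip through a specific interpreter with -m pip. The sketch below builds such a command for whichever Python runs the script; pip_command is a hypothetical helper name, and the example only asks pip for its version rather than installing anything.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip command that is bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # "pip --version" reports which Python installation this pip belongs to.
    result = subprocess.run(
        pip_command("--version"), capture_output=True, text=True, check=False
    )
    print(result.stdout.strip() or result.stderr.strip())
```

Called this way, packages end up in the package folder of that interpreter, regardless of the order of entries in PATH.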
With a virtual environment, all the libraries I install go into the environment's folder, which sits next to my code - and of course, I wouldn't put that on the C drive.
It is these obstacles that made me finally understand the true meaning of a virtual environment - it has its own set of pip-installed libraries and a working environment independent of any other Python interpreter!
My C drive will be burdened with more and more libraries, and it already has limited free space. All of this is because I cannot install PyHanLP in the virtual environment! It's all because I didn't understand "virtual environment" before.
Although I am also a beginner, at least from today's experience, I may be able to give some advice.
If you are a Python beginner: if you are going to install many Python third-party libraries, plan to use Python frequently in your work, and don't want your C drive to bear too much of the burden (you want to store things on other drives), then do yourself a favor and install Conda (Conda is a tool for managing Python libraries and creating virtual environments).
If you also want to use the HanLP library, the official website also provides a conda download solution. It can be installed with just two lines of code (not more complicated than the foolproof installation package). And I guess that if the installation path is configured in conda, then the PyHanLP it installs will not be foolishly installed on the C drive!
What is a Virtual Environment#
In short, a virtual environment is a specific version of Python paired with its own independent set of pip-installed libraries.
- A virtual environment is like installing a new Python, but it doesn't actually take up much disk space, since it reuses an existing Python installation instead of being a full installation itself.
- If your project has activated a virtual environment, you can only use the Python libraries installed in that environment. Even if your original Python has many libraries, the Python in the virtual environment cannot call them.
- Virtual environments keep the libraries required by one project separate from those required by other projects, so each project only carries the libraries it actually needs.
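The points above can be observed with Python's built-in venv module. The sketch below creates a throwaway environment in a temporary folder and reads its pyvenv.cfg file, which records the base Python the environment relies on. The function name create_env_and_read_config and the folder name demo-env are made up for this example, and pip is left out of the environment to keep creation fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env_and_read_config(env_dir):
    """Create a virtual environment (without pip) and return its pyvenv.cfg as a dict."""
    venv.create(env_dir, with_pip=False)
    config = {}
    cfg_text = (Path(env_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not "key = value"
            config[key.strip()] = value.strip()
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env_and_read_config(Path(tmp) / "demo-env")
        # "home" is the folder of the base Python this environment is built on
        print("base interpreter directory:", cfg.get("home"))
        print("python version:", cfg.get("version"))
        print("sees global libraries:", cfg.get("include-system-site-packages"))
```

The "home" entry points back to an existing Python installation, and the last entry shows that, by default, the environment does not see libraries installed globally.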
How to Create a Virtual Environment in VSCode#
Create a dedicated Python virtual environment for your project in VSCode:
- Open VSCode and open your project folder.
- Click on the Python interpreter shown in the bottom right corner of VSCode (in the status bar), and in the pop-up options, select "Create Virtual Environment". Click to create a virtual environment and choose the Python version you intend to use.
- Open a new terminal and enter
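.\venv\Scripts\activate
to activate the virtual environment. Activating does not move the terminal into that directory; it makes the terminal use the environment's own Python and pip. If your environment folder has a different name, such as .venv, adjust the path accordingly. It is worth noting that you normally don't need to activate the virtual environment by hand the next time you open this folder with VSCode. Once the environment is selected as the interpreter, new terminals activate it automatically.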
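Activation is a convenience: the environment's interpreter is an ordinary executable inside the environment folder and can also be called directly by its path. The sketch below computes that path, assuming the standard venv layout (Scripts on Windows, bin elsewhere); venv_python is a hypothetical helper name and "venv" is just the folder name used in the command above.

```python
import os
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the Python executable inside a virtual environment folder."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        # Windows layout: <env>\Scripts\python.exe
        return env_dir / "Scripts" / "python.exe"
    # Linux and macOS layout: <env>/bin/python
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    path = venv_python("venv")
    print("environment interpreter:", path)
    print("exists on this machine:", path.exists())
```

Running a script with that executable uses the environment's libraries even in a terminal where the environment has not been activated.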
The Outcome of This Matter#
I don't plan to install Conda and reinstall HanLP for now, but I also don't plan to keep using everything this foolproof installation package configured on the C drive forever.
I don't have more time to learn how to use Conda, not only because it has an English interface, but also because I can rarely spare half a day. I have many things to do.
After I pass the postgraduate entrance examination, I will come back and operate this great tool properly! Now, I can only compromise and complete my graduation project on the C drive.